* [PATCH RFC 0/4] barriers using data dependency
@ 2019-01-02 20:57 ` Michael S. Tsirkin
  0 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization

As explained in Documentation/memory-barriers.txt, a load followed by a
store requires a full memory barrier to keep the store from being
ordered before the load.  Similarly, load-load requires a read memory
barrier.

Thinking about it, we can actually create a data dependency
by mixing the first loaded value into the pointer being
accessed.

This adds an API for this and uses it in virtio.
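
To make the idea concrete, here is a minimal userspace sketch of the two
forms side by side.  It is not code from this series: HIDE_VAR, ready,
payload and both consume_* helpers are hypothetical names, the acquire
fence only approximates smp_rmb(), and whether the dependent form gives
the required ordering on a given CPU and compiler is an assumption, not
something this sketch proves.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's OPTIMIZER_HIDE_VAR(): an empty
 * asm statement that the compiler must assume may rewrite `var`. */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Hypothetical shared state: a producer writes `payload`, then sets `ready`. */
static volatile int ready;
static int payload;

/* Conventional form: a read barrier between the two loads. */
static int consume_with_barrier(void)
{
	int r = ready;					/* first load */

	(void)r;
	__atomic_thread_fence(__ATOMIC_ACQUIRE);	/* approximates smp_rmb() */
	return *(volatile int *)&payload;		/* second load */
}

/* Proposed form: the address of the second load is computed from the
 * value returned by the first load, so the loads are linked by data. */
static int consume_with_dependency(void)
{
	long r = ready;					/* first load */
	uintptr_t base = (uintptr_t)&payload - (uintptr_t)r;
	volatile int *p;

	HIDE_VAR(r);		/* compiler can no longer cancel -r and +r */
	p = (volatile int *)(base + (uintptr_t)r);
	return *p;					/* second load */
}
```

Both helpers return the same value; the only difference is how the second
load is tied to the first.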

Written over the holiday and build tested only so far.

This patchset is also suboptimal on architectures such as x86, where
smp_rmb is a nop.

Sending out for early feedback/flames.

Michael S. Tsirkin (4):
  include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  include/linux/compiler.h: allow memory operands
  barriers: convert a control to a data dependency
  virtio: use dependent_ptr_mb

 Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
 arch/alpha/include/asm/barrier.h  |  1 +
 drivers/virtio/virtio_ring.c      |  6 ++++--
 include/asm-generic/barrier.h     | 18 ++++++++++++++++++
 include/linux/compiler-clang.h    |  5 ++---
 include/linux/compiler-gcc.h      |  4 ----
 include/linux/compiler-intel.h    |  4 +---
 include/linux/compiler.h          |  8 +++++++-
 8 files changed, 53 insertions(+), 13 deletions(-)

-- 
MST



* [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-02 20:57 ` Michael S. Tsirkin
  (?)
@ 2019-01-02 20:57   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization, Eli Friedman,
	Joe Perches, Nick Desaulniers, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse

Since commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
mutually exclusive") clang no longer reuses the OPTIMIZER_HIDE_VAR macro
from compiler-gcc - instead it gets the version in
include/linux/compiler.h.  Unfortunately that version doesn't actually
prevent the compiler from optimizing out the variable.

Fix up by moving the macro out from compiler-gcc.h to compiler.h.
Compilers without inline asm support will keep working
since it's protected by an ifdef.
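
For illustration only, the difference between the two definitions can be
sketched in userspace as below.  The macro and function names are
hypothetical copies made for this example.  barrier() clobbers memory
only, so a value the compiler keeps in a register is still fully known
to it; the empty asm with an output operand makes the variable itself
opaque.

```c
/* Hypothetical userspace copies of the two definitions discussed above. */
#define barrier()	__asm__ __volatile__ ("" : : : "memory")
#define HIDE_VAR(var)	__asm__ ("" : "=r" (var) : "0" (var))

/* The compiler still knows v here, so it is free to fold
 * (p - v) + v down to p and drop v from the result entirely. */
static unsigned long roundtrip_barrier(unsigned long p, unsigned long v)
{
	unsigned long d = p - v;

	barrier();
	return d + v;
}

/* After HIDE_VAR the compiler must treat v as an unknown value, so the
 * final addition has to be computed from v at run time. */
static unsigned long roundtrip_hidden(unsigned long p, unsigned long v)
{
	unsigned long d = p - v;

	HIDE_VAR(v);
	return d + v;
}
```

Both functions return p; they differ only in what the optimizer is
allowed to assume, which can be checked by comparing the generated
assembly.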

Also fix up comments to match reality since we are no longer overriding
any macros.

Build-tested with gcc and clang.

Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
Cc: Eli Friedman <efriedma@codeaurora.org>
Cc: Joe Perches <joe@perches.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/compiler-clang.h | 5 ++---
 include/linux/compiler-gcc.h   | 4 ----
 include/linux/compiler-intel.h | 4 +---
 include/linux/compiler.h       | 4 +++-
 4 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 3e7dafb3ea80..7ddaeb5182e3 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -3,9 +3,8 @@
 #error "Please don't include <linux/compiler-clang.h> directly, include <linux/compiler.h> instead."
 #endif
 
-/* Some compiler specific definitions are overwritten here
- * for Clang compiler
- */
+/* Compiler specific definitions for Clang compiler */
+
 #define uninitialized_var(x) x = *(&(x))
 
 /* same as gcc, this was present in clang-2.6 so we can assume it works
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 2010493e1040..72054d9f0eaa 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -58,10 +58,6 @@
 	(typeof(ptr)) (__ptr + (off));					\
 })
 
-/* Make the optimizer believe the variable can be manipulated arbitrarily. */
-#define OPTIMIZER_HIDE_VAR(var)						\
-	__asm__ ("" : "=r" (var) : "0" (var))
-
 /*
  * A trick to suppress uninitialized variable warning without generating any
  * code
diff --git a/include/linux/compiler-intel.h b/include/linux/compiler-intel.h
index 517bd14e1222..b17f3cd18334 100644
--- a/include/linux/compiler-intel.h
+++ b/include/linux/compiler-intel.h
@@ -5,9 +5,7 @@
 
 #ifdef __ECC
 
-/* Some compiler specific definitions are overwritten here
- * for Intel ECC compiler
- */
+/* Compiler specific definitions for Intel ECC compiler */
 
 #include <asm/intrinsics.h>
 
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 06396c1cf127..1ad367b4cd8d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -152,7 +152,9 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #endif
 
 #ifndef OPTIMIZER_HIDE_VAR
-#define OPTIMIZER_HIDE_VAR(var) barrier()
+/* Make the optimizer believe the variable can be manipulated arbitrarily. */
+#define OPTIMIZER_HIDE_VAR(var)						\
+	__asm__ ("" : "=r" (var) : "0" (var))
 #endif
 
 /* Not-quite-unique ID. */
-- 
MST



* [PATCH RFC 2/4] include/linux/compiler.h: allow memory operands
  2019-01-02 20:57 ` Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  (?)
@ 2019-01-02 20:57 ` Michael S. Tsirkin
  2019-01-07 17:54     ` Will Deacon
  -1 siblings, 1 reply; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Luc Van Oostenryck, linux-sparse

We don't really care whether the variable is in-register
or in-memory. Relax the constraint accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/linux/compiler.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 1ad367b4cd8d..6601d39e8c48 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -154,7 +154,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #ifndef OPTIMIZER_HIDE_VAR
 /* Make the optimizer believe the variable can be manipulated arbitrarily. */
 #define OPTIMIZER_HIDE_VAR(var)						\
-	__asm__ ("" : "=r" (var) : "0" (var))
+	__asm__ ("" : "=rm" (var) : "0" (var))
 #endif
 
 /* Not-quite-unique ID. */
-- 
MST



* [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-02 20:57 ` Michael S. Tsirkin
  (?)
@ 2019-01-02 20:57   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

It's not uncommon to have to access two unrelated memory locations in a
specific order.  At the moment one has to use a memory barrier for this.

However, if the first access was a read and the second used an address
depending on the first one we would have a data dependency and no
barrier would be necessary.

This adds a new interface: dependent_ptr_mb which does exactly this: it
returns a pointer with a data dependency on the supplied value.
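
As a rough userspace sketch of the shape of such an interface (not the
kernel macro itself): dependent_ptr, HIDE_VAR, USE_FULL_BARRIER and
first_element_after are hypothetical names, USE_FULL_BARRIER stands in
for the per-architecture opt-out, and __sync_synchronize() stands in for
mb().  The pointer argument must have pointer type, not array type,
because the result is cast back with __typeof__.

```c
#include <stdint.h>

/* Hypothetical stand-in for OPTIMIZER_HIDE_VAR(). */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

#ifndef USE_FULL_BARRIER
/* Subtract val from the pointer, hide val from the optimizer, then add
 * it back: the result equals ptr but is computed from val. */
#define dependent_ptr(ptr, val) __extension__ ({			\
	long dep_val_ = (long)(val);					\
	uintptr_t dep_ptr_ = (uintptr_t)(ptr) - (uintptr_t)dep_val_;	\
	_Static_assert(sizeof(val) <= sizeof(long), "val too wide");	\
	HIDE_VAR(dep_val_);						\
	(__typeof__(ptr))(dep_ptr_ + (uintptr_t)dep_val_);		\
})
#else
/* Fallback: a full barrier, with the pointer returned unchanged. */
#define dependent_ptr(ptr, val) \
	__extension__ ({ __sync_synchronize(); (ptr); })
#endif

/* Example caller: the load through p uses an address derived from the
 * previously read index value. */
static int first_element_after(int *ring, int idx_seen)
{
	int *p = dependent_ptr(ring, idx_seen);

	return p[0];
}
```

The returned pointer, not the original one, has to be used for the later
access; otherwise the dependency is lost.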

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
 arch/alpha/include/asm/barrier.h  |  1 +
 include/asm-generic/barrier.h     | 18 ++++++++++++++++++
 include/linux/compiler.h          |  4 ++++
 4 files changed, 43 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index c1d913944ad8..9dbaa2e1dbf6 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -691,6 +691,18 @@ case what's actually required is:
 		p = READ_ONCE(b);
 	}
 
+Alternatively, a control dependency can be converted to a data dependency,
+e.g.:
+
+	q = READ_ONCE(a);
+	if (q) {
+		b = dependent_ptr_mb(b, q);
+		p = READ_ONCE(b);
+	}
+
+Note how the result of dependent_ptr_mb must be used with the following
+accesses in order to have an effect.
+
 However, stores are not speculated.  This means that ordering -is- provided
 for load-store control dependencies, as in the following example:
 
@@ -836,6 +848,12 @@ out-guess your code.  More generally, although READ_ONCE() does force
 the compiler to actually emit code for a given load, it does not force
 the compiler to use the results.
 
+Converting to a data dependency helps with this too:
+
+	q = READ_ONCE(a);
+	b = dependent_ptr_mb(b, q);
+	WRITE_ONCE(b, 1);
+
 In addition, control dependencies apply only to the then-clause and
 else-clause of the if-statement in question.  In particular, it does
 not necessarily apply to code following the if-statement:
@@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
 for more information.
 
 
+
+
 In summary:
 
   (*) Control dependencies can order prior loads against later stores.
diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
index 92ec486a4f9e..b4934e8c551b 100644
--- a/arch/alpha/include/asm/barrier.h
+++ b/arch/alpha/include/asm/barrier.h
@@ -59,6 +59,7 @@
  * as Alpha, "y" could be set to 3 and "x" to 0.  Use rmb()
  * in cases like this where there are no data dependencies.
  */
+#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
 #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
 
 #ifdef CONFIG_SMP
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 2cafdbb9ae4c..fa2e2ef72b68 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -70,6 +70,24 @@
 #define __smp_read_barrier_depends()	read_barrier_depends()
 #endif
 
+#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
+	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
+
+#define dependent_ptr_mb(ptr, val) ({					\
+	long dependent_ptr_mb_val = (long)(val);			\
+	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
+									\
+	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
+	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
+	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
+})
+
+#else
+
+#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
+
+#endif
+
 #ifdef CONFIG_SMP
 
 #ifndef smp_mb
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6601d39e8c48..f599c30f1b28 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #endif
 
 #ifndef OPTIMIZER_HIDE_VAR
+
 /* Make the optimizer believe the variable can be manipulated arbitrarily. */
 #define OPTIMIZER_HIDE_VAR(var)						\
 	__asm__ ("" : "=rm" (var) : "0" (var))
+
+#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
+
 #endif
 
 /* Not-quite-unique ID. */
-- 
MST



* [PATCH RFC 3/4] barriers: convert a control to a data dependency
@ 2019-01-02 20:57   ` Michael S. Tsirkin
  0 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:57 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd

It's not uncommon to have to access two unrelated memory locations in a
specific order.  At the moment one has to use a memory barrier for this.

However, if the first access is a read and the second uses an address
that depends on the value read, we have a data dependency and no
barrier is necessary.

This adds a new interface, dependent_ptr_mb, which does exactly this: it
returns a pointer with a data dependency on the supplied value.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
 arch/alpha/include/asm/barrier.h  |  1 +
 include/asm-generic/barrier.h     | 18 ++++++++++++++++++
 include/linux/compiler.h          |  4 ++++
 4 files changed, 43 insertions(+)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index c1d913944ad8..9dbaa2e1dbf6 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -691,6 +691,18 @@ case what's actually required is:
 		p = READ_ONCE(b);
 	}
 
+Alternatively, a control dependency can be converted to a data dependency,
+e.g.:
+
+	q = READ_ONCE(a);
+	if (q) {
+		b = dependent_ptr_mb(b, q);
+		p = READ_ONCE(b);
+	}
+
+Note how the result of dependent_ptr_mb must be used with the following
+accesses in order to have an effect.
+
 However, stores are not speculated.  This means that ordering -is- provided
 for load-store control dependencies, as in the following example:
 
@@ -836,6 +848,12 @@ out-guess your code.  More generally, although READ_ONCE() does force
 the compiler to actually emit code for a given load, it does not force
 the compiler to use the results.
 
+Converting to a data dependency helps with this too:
+
+	q = READ_ONCE(a);
+	b = dependent_ptr_mb(b, q);
+	WRITE_ONCE(b, 1);
+
 In addition, control dependencies apply only to the then-clause and
 else-clause of the if-statement in question.  In particular, it does
 not necessarily apply to code following the if-statement:
@@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
 for more information.
 
 
+
+
 In summary:
 
   (*) Control dependencies can order prior loads against later stores.
diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
index 92ec486a4f9e..b4934e8c551b 100644
--- a/arch/alpha/include/asm/barrier.h
+++ b/arch/alpha/include/asm/barrier.h
@@ -59,6 +59,7 @@
  * as Alpha, "y" could be set to 3 and "x" to 0.  Use rmb()
  * in cases like this where there are no data dependencies.
  */
+#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
 #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
 
 #ifdef CONFIG_SMP
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 2cafdbb9ae4c..fa2e2ef72b68 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -70,6 +70,24 @@
 #define __smp_read_barrier_depends()	read_barrier_depends()
 #endif
 
+#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
+	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
+
+#define dependent_ptr_mb(ptr, val) ({					\
+	long dependent_ptr_mb_val = (long)(val);			\
+	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
+									\
+	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
+	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
+	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
+})
+
+#else
+
+#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
+
+#endif
+
 #ifdef CONFIG_SMP
 
 #ifndef smp_mb
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 6601d39e8c48..f599c30f1b28 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #endif
 
 #ifndef OPTIMIZER_HIDE_VAR
+
 /* Make the optimizer believe the variable can be manipulated arbitrarily. */
 #define OPTIMIZER_HIDE_VAR(var)						\
 	__asm__ ("" : "=rm" (var) : "0" (var))
+
+#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
+
 #endif
 
 /* Not-quite-unique ID. */
-- 
MST

^ permalink raw reply related	[flat|nested] 94+ messages in thread
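
The pointer trick in the patch above can be tried outside the kernel.  What
follows is a minimal user-space sketch, not the kernel implementation:
HIDE_VAR and dependent_ptr() are hypothetical names chosen for illustration,
unsigned arithmetic is used so that the subtraction cannot overflow, and a
GCC-compatible compiler is assumed for the inline asm.  It shows only that
the returned pointer equals the original one while being computed from a
value the optimizer cannot see through.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for OPTIMIZER_HIDE_VAR: an empty asm statement that
 * takes var as both input and output, so the compiler must assume the value
 * may have changed.
 */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Return ptr, computed as (ptr - val) + val with val hidden from the
 * optimizer in between, so the two operations cannot be folded away.
 * dependent_ptr() is an illustrative name, not a kernel API.
 */
static inline void *dependent_ptr(void *ptr, uintptr_t val)
{
	uintptr_t base = (uintptr_t)ptr - val;

	HIDE_VAR(val);
	return (void *)(base + val);
}
```

A caller would pass the address of the object it wants to access, for
example dependent_ptr(&b, q), and then perform the access through the
returned pointer; using the original pointer afterwards discards the
dependency.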

* [PATCH RFC 4/4] virtio: use dependent_ptr_mb
  2019-01-02 20:57 ` Michael S. Tsirkin
                   ` (7 preceding siblings ...)
  (?)
@ 2019-01-02 20:58 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 20:58 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization

Use dependent_ptr_mb which is - on some architectures -
more light-weight than an rmb.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/virtio/virtio_ring.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 814b395007b2..2d320396eff8 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -702,6 +702,7 @@ void *virtqueue_get_buf_ctx(struct virtqueue *_vq, unsigned int *len,
 	void *ret;
 	unsigned int i;
 	u16 last_used;
+	bool more;
 
 	START_USE(vq);
 
@@ -710,14 +711,15 @@ void *virtqueue_get_buf_ctx(struct virtqueue *_vq, unsigned int *len,
 		return NULL;
 	}
 
-	if (!more_used(vq)) {
+	more = more_used(vq);
+	if (!more) {
 		pr_debug("No more buffers in queue\n");
 		END_USE(vq);
 		return NULL;
 	}
 
 	/* Only get used array entries after they have been exposed by host. */
-	virtio_rmb(vq->weak_barriers);
+	vq = dependent_ptr_mb(vq, more);
 
 	last_used = (vq->last_used_idx & (vq->vring.num - 1));
 	i = virtio32_to_cpu(_vq->vdev, vq->vring.used->ring[last_used].id);
-- 
MST


^ permalink raw reply related	[flat|nested] 94+ messages in thread
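
The consumer-side pattern that this patch changes can be sketched in user
space as follows.  Everything here is an assumption made for illustration:
struct demo_ring is not the virtio ring layout, and ring_dep() and
demo_ring_pop() are hypothetical names.  Unlike the patch, which passes the
boolean result of more_used(), the sketch mixes the loaded index itself into
the pointer; a value the compiler can deduce from the preceding branch may
be treated as a constant, which is one reason to be cautious here.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8	/* must be a power of two */

/* Hypothetical ring, loosely modelled on a used ring; not the virtio layout. */
struct demo_ring {
	volatile uint16_t used_idx;	/* written by the producer */
	uint32_t id[RING_SIZE];		/* entries published before used_idx */
	uint16_t last_used;		/* consumer's private cursor */
};

/* Same idea as dependent_ptr_mb(): r is recomputed from a hidden copy of val. */
static inline struct demo_ring *ring_dep(struct demo_ring *r, uintptr_t val)
{
	uintptr_t base = (uintptr_t)r - val;

	__asm__ ("" : "=r" (val) : "0" (val));
	return (struct demo_ring *)(base + val);
}

/* Pop one entry; returns false when the ring is empty. */
static bool demo_ring_pop(struct demo_ring *r, uint32_t *id)
{
	uint16_t used = r->used_idx;		/* first load */

	if (used == r->last_used)
		return false;

	r = ring_dep(r, used);			/* mix the loaded value in */
	*id = r->id[r->last_used & (RING_SIZE - 1)];	/* dependent load */
	r->last_used++;
	return true;
}
```

The sketch is single-threaded and only demonstrates the shape of the code;
it makes no claim about which architectures honour the resulting ordering.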

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-02 20:57   ` Michael S. Tsirkin
  (?)
@ 2019-01-02 21:00     ` Matthew Wilcox
  -1 siblings, 0 replies; 94+ messages in thread
From: Matthew Wilcox @ 2019-01-02 21:00 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Wed, Jan 02, 2019 at 03:57:58PM -0500, Michael S. Tsirkin wrote:
> @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
>  for more information.
>  
>  
> +
> +
>  In summary:
>  
>    (*) Control dependencies can order prior loads against later stores.

Was this hunk intentional?

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-02 21:00     ` Matthew Wilcox
  (?)
@ 2019-01-02 21:24       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 21:24 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Wed, Jan 02, 2019 at 01:00:24PM -0800, Matthew Wilcox wrote:
> On Wed, Jan 02, 2019 at 03:57:58PM -0500, Michael S. Tsirkin wrote:
> > @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
> >  for more information.
> >  
> >  
> > +
> > +
> >  In summary:
> >  
> >    (*) Control dependencies can order prior loads against later stores.
> 
> Was this hunk intentional?

Nope, thanks for catching this.

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 0/4] barriers using data dependency
  2019-01-02 20:57 ` Michael S. Tsirkin
@ 2019-01-02 21:36   ` Alan Stern
  -1 siblings, 0 replies; 94+ messages in thread
From: Alan Stern @ 2019-01-02 21:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Jason Wang, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization

On Wed, 2 Jan 2019, Michael S. Tsirkin wrote:

> So as explained in Documentation/memory-barriers.txt e.g.
> a load followed by a store require a full memory barrier,
> to avoid store being ordered before the load.
> Similarly load-load requires a read memory barrier.
> 
> Thinking about it, we can actually create a data dependency
> by mixing the first loaded value into the pointer being
> accessed.
> 
> This adds an API for this and uses it in virtio.
> 
> Written over the holiday and build tested only so far.

You are using the terminology from memory-barriers.txt, referring to
the new dependency you create as a data dependency.  However,
tools/memory-model/* uses a more precise name, calling it an address
dependency.  Could you change the comments in the patches to use this
name instead?

> This patchset is also suboptimal on e.g. x86 where e.g. smp_rmb is a nop.

This should be easy to fix with an architecture-specific override.

Alan Stern

> Sending out for early feedback/flames.
> 
> Michael S. Tsirkin (4):
>   include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
>   include/linux/compiler.h: allow memory operands
>   barriers: convert a control to a data dependency
>   virtio: use dependent_ptr_mb
> 
>  Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
>  arch/alpha/include/asm/barrier.h  |  1 +
>  drivers/virtio/virtio_ring.c      |  6 ++++--
>  include/asm-generic/barrier.h     | 18 ++++++++++++++++++
>  include/linux/compiler-clang.h    |  5 ++---
>  include/linux/compiler-gcc.h      |  4 ----
>  include/linux/compiler-intel.h    |  4 +---
>  include/linux/compiler.h          |  8 +++++++-
>  8 files changed, 53 insertions(+), 13 deletions(-)


^ permalink raw reply	[flat|nested] 94+ messages in thread
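
The terminology distinction raised above can be illustrated with two small
functions.  This is a sketch with hypothetical names, following the usage in
tools/memory-model: in an address dependency the loaded value determines
which address the later access touches, while in a data dependency it
determines the value that a later store writes.  The volatile qualifiers
only stand in for READ_ONCE()/WRITE_ONCE() here.

```c
/*
 * Address dependency: the value loaded from *idx selects the address
 * of the second load.
 */
static int load_with_address_dep(const volatile int *idx,
				 const volatile int *table)
{
	int i = *idx;

	return table[i];
}

/*
 * Data dependency: the value loaded from *src determines the value
 * stored; the address of the store does not depend on it.
 */
static void store_with_data_dep(const volatile int *src, volatile int *dst)
{
	int v = *src;

	*dst = v + 1;
}
```

Under this naming, the pointer returned by dependent_ptr_mb carries an
address dependency, since the loaded value is folded into the address of
the following access.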

* Re: [PATCH RFC 0/4] barriers using data dependency
  2019-01-02 21:36   ` Alan Stern
@ 2019-01-02 23:04     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-02 23:04 UTC (permalink / raw)
  To: Alan Stern
  Cc: linux-kernel, Jason Wang, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization

On Wed, Jan 02, 2019 at 04:36:40PM -0500, Alan Stern wrote:
> On Wed, 2 Jan 2019, Michael S. Tsirkin wrote:
> 
> > So as explained in Documentation/memory-barriers.txt e.g.
> > a load followed by a store require a full memory barrier,
> > to avoid store being ordered before the load.
> > Similarly load-load requires a read memory barrier.
> > 
> > Thinking about it, we can actually create a data dependency
> > by mixing the first loaded value into the pointer being
> > accessed.
> > 
> > This adds an API for this and uses it in virtio.
> > 
> > Written over the holiday and build tested only so far.
> 
> You are using the terminology from memory-barriers.txt, referring to
> the new dependency you create as a data dependency.  However,
> tools/memory-model/* uses a more precise name, calling it an address
> dependency.  Could you change the comments in the patches to use this
> name instead?

Sure, sounds good. While I'm at it, should memory-barriers.txt be
switched over too?

> > This patchset is also suboptimal on e.g. x86 where e.g. smp_rmb is a nop.
> 
> This should be easy to fix with an architecture-specific override.
> 
> Alan Stern

Absolutely. It does however mean that we'll need several
variants: mb/rmb, smp/dma/virt/mandatory.

I am still trying to decide whether it's good since it documents the
kind of barrier that we are trying to use - or bad since it's more
verbose and makes you choose one where they are all pretty cheap.

> > Sending out for early feedback/flames.
> > 
> > Michael S. Tsirkin (4):
> >   include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
> >   include/linux/compiler.h: allow memory operands
> >   barriers: convert a control to a data dependency
> >   virtio: use dependent_ptr_mb
> > 
> >  Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
> >  arch/alpha/include/asm/barrier.h  |  1 +
> >  drivers/virtio/virtio_ring.c      |  6 ++++--
> >  include/asm-generic/barrier.h     | 18 ++++++++++++++++++
> >  include/linux/compiler-clang.h    |  5 ++---
> >  include/linux/compiler-gcc.h      |  4 ----
> >  include/linux/compiler-intel.h    |  4 +---
> >  include/linux/compiler.h          |  8 +++++++-
> >  8 files changed, 53 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 94+ messages in thread
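
One way the architecture-specific override discussed above might look is the
usual define-first, fall-back-later pattern.  This is only a sketch under
stated assumptions: dependent_ptr_rmb and DEMO_ARCH_ORDERED_LOADS are
hypothetical names, the override assumes an architecture on which a read
barrier reduces to a compiler barrier, and GNU C statement expressions are
required.

```c
#include <stdint.h>

/*
 * Hypothetical arch header section: when the hardware already keeps loads
 * in order, only the compiler has to be stopped from reordering.
 * Build with -DDEMO_ARCH_ORDERED_LOADS to select this variant.
 */
#ifdef DEMO_ARCH_ORDERED_LOADS
#define dependent_ptr_rmb(ptr, val) \
	({ __asm__ __volatile__("" : : : "memory"); (void)(val); (ptr); })
#endif

/* Generic fallback, used only when no override was provided above. */
#ifndef dependent_ptr_rmb
#define dependent_ptr_rmb(ptr, val) ({					\
	uintptr_t dep_val = (uintptr_t)(val);				\
	uintptr_t dep_ptr = (uintptr_t)(ptr) - dep_val;			\
									\
	__asm__ ("" : "=r" (dep_val) : "0" (dep_val));			\
	(__typeof__(ptr))(dep_ptr + dep_val);				\
})
#endif
```

Each further variant (mb versus rmb, smp versus dma and so on) would need
its own macro and its own fallback, which is the verbosity cost weighed in
the reply above.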

* Re: [PATCH RFC 0/4] barriers using data dependency
  2019-01-02 23:04     ` Michael S. Tsirkin
@ 2019-01-03 15:11       ` Alan Stern
  -1 siblings, 0 replies; 94+ messages in thread
From: Alan Stern @ 2019-01-03 15:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Jason Wang, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization

On Wed, 2 Jan 2019, Michael S. Tsirkin wrote:

> On Wed, Jan 02, 2019 at 04:36:40PM -0500, Alan Stern wrote:
> > On Wed, 2 Jan 2019, Michael S. Tsirkin wrote:
> > 
> > > So as explained in Documentation/memory-barriers.txt e.g.
> > > a load followed by a store require a full memory barrier,
> > > to avoid store being ordered before the load.
> > > Similarly load-load requires a read memory barrier.
> > > 
> > > Thinking about it, we can actually create a data dependency
> > > by mixing the first loaded value into the pointer being
> > > accessed.
> > > 
> > > This adds an API for this and uses it in virtio.
> > > 
> > > Written over the holiday and build tested only so far.
> > 
> > You are using the terminology from memory-barriers.txt, referring to
> > the new dependency you create as a data dependency.  However,
> > tools/memory-model/* uses a more precise name, calling it an address
> > dependency.  Could you change the comments in the patches to use this
> > name instead?
> 
> Sure, sounds good. While I'm at it, should memory-barriers.txt be
> switched over too?

If you want to take care of that, great!  I never seem to get around to 
doing it.
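
To make the distinction concrete, here is a minimal user-space sketch.
The READ_ONCE/WRITE_ONCE definitions are simplified stand-ins rather than
the kernel's, and idx, table and out are made-up names.  In the
tools/memory-model sense, an address dependency exists when a loaded value
determines the address of a later access; a data dependency exists when a
loaded value determines the value written by a later store.

```c
/* Simplified user-space stand-ins for the kernel macros (GCC/Clang only). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static int idx;       /* would be written by another CPU in a real scenario */
static int table[4];
static int out;

/* Address dependency: the loaded value selects WHICH location is read. */
static int load_with_address_dependency(void)
{
	int i = READ_ONCE(idx);

	return READ_ONCE(table[i & 3]);
}

/* Data dependency: the loaded value determines WHAT is stored. */
static void store_with_data_dependency(void)
{
	int v = READ_ONCE(idx);

	WRITE_ONCE(out, v + 1);
}
```

The patch mixes the loaded value into the pointer used by the later
access, which is the first pattern, hence "address dependency".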

> > > This patchset is also suboptimal on e.g. x86 where e.g. smp_rmb is a nop.
> > 
> > This should be easy to fix with an architecture-specific override.
> > 
> > Alan Stern
> 
> Absolutely. It does however mean that we'll need several
> variants: mb/rmb, smp/dma/virt/mandatory.
> 
> I am still trying to decide whether it's good since it documents the
> kind of barrier that we are trying to use - or bad since it's more
> verbose and makes you choose one where they are all pretty cheap.

How many places can these things be used?  My guess is not very many,
or at least, there aren't very many different _types_ of usage.  So
start only with variants you know will be used.  More can be added
later if we want.
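
A rough sketch of what such an override could look like, written for user
space so it can be compiled on its own.  The name dependent_ptr_sketch,
the use of compiler-predefined x86 macros in place of a real arch header,
and the unsigned arithmetic are all assumptions made for illustration;
none of this is taken from the patches.

```c
/* Compiler-only barrier, in the style of the kernel's barrier(). */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Register-only form of OPTIMIZER_HIDE_VAR, kept simple for this sketch. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * "Arch header" (hypothetical): x86 keeps a load ordered before later
 * loads and stores, so only the compiler has to be restrained.
 */
#if defined(__x86_64__) || defined(__i386__)
#define dependent_ptr_sketch(ptr, val) ({ (void)(val); barrier(); (ptr); })
#endif

/*
 * "Generic header": used only when no override was defined above.  Mixes
 * val into the pointer so the later access carries an address dependency
 * on val.  unsigned long avoids signed overflow in user space and assumes
 * sizeof(long) == sizeof(void *), as the kernel does.
 */
#ifndef dependent_ptr_sketch
#define dependent_ptr_sketch(ptr, val) ({				\
	unsigned long dep_val = (unsigned long)(val);			\
	unsigned long dep_ptr = (unsigned long)(ptr) - dep_val;		\
									\
	OPTIMIZER_HIDE_VAR(dep_val);					\
	(__typeof__(ptr))(dep_ptr + dep_val);				\
})
#endif
```

Either branch returns the pointer it was given, so callers look the same
on every architecture and only the cost differs.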

Alan Stern


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-02 20:57   ` Michael S. Tsirkin
@ 2019-01-07  3:58     ` Jason Wang
  -1 siblings, 0 replies; 94+ messages in thread
From: Jason Wang @ 2019-01-07  3:58 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Alan Stern, Andrea Parri, Will Deacon, Peter Zijlstra,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	linux-arch, netdev, virtualization, Jonathan Corbet,
	Richard Henderson, Ivan Kokshaysky, Matt Turner, Arnd Bergmann,
	Luc Van Oostenryck, linux-doc, linux-alpha, linux-sparse


On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> It's not uncommon to have to access two unrelated memory locations in a
> specific order.  At the moment one has to use a memory barrier for this.
>
> However, if the first access was a read and the second used an address
> depending on the first one we would have a data dependency and no
> barrier would be necessary.
>
> This adds a new interface: dependent_ptr_mb which does exactly this: it
> returns a pointer with a data dependency on the supplied value.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>   Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
>   arch/alpha/include/asm/barrier.h  |  1 +
>   include/asm-generic/barrier.h     | 18 ++++++++++++++++++
>   include/linux/compiler.h          |  4 ++++
>   4 files changed, 43 insertions(+)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index c1d913944ad8..9dbaa2e1dbf6 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -691,6 +691,18 @@ case what's actually required is:
>   		p = READ_ONCE(b);
>   	}
>   
> +Alternatively, a control dependency can be converted to a data dependency,
> +e.g.:
> +
> +	q = READ_ONCE(a);
> +	if (q) {
> +		b = dependent_ptr_mb(b, q);
> +		p = READ_ONCE(b);
> +	}
> +
> +Note how the result of dependent_ptr_mb must be used with the following
> +accesses in order to have an effect.
> +
>   However, stores are not speculated.  This means that ordering -is- provided
>   for load-store control dependencies, as in the following example:
>   
> @@ -836,6 +848,12 @@ out-guess your code.  More generally, although READ_ONCE() does force
>   the compiler to actually emit code for a given load, it does not force
>   the compiler to use the results.
>   
> +Converting to a data dependency helps with this too:
> +
> +	q = READ_ONCE(a);
> +	b = dependent_ptr_mb(b, q);
> +	WRITE_ONCE(b, 1);
> +
>   In addition, control dependencies apply only to the then-clause and
>   else-clause of the if-statement in question.  In particular, it does
>   not necessarily apply to code following the if-statement:
> @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
>   for more information.
>   
>   
> +
> +
>   In summary:
>   
>     (*) Control dependencies can order prior loads against later stores.
> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> index 92ec486a4f9e..b4934e8c551b 100644
> --- a/arch/alpha/include/asm/barrier.h
> +++ b/arch/alpha/include/asm/barrier.h
> @@ -59,6 +59,7 @@
>    * as Alpha, "y" could be set to 3 and "x" to 0.  Use rmb()
>    * in cases like this where there are no data dependencies.
>    */
> +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
>   #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
>   
>   #ifdef CONFIG_SMP
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index 2cafdbb9ae4c..fa2e2ef72b68 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -70,6 +70,24 @@
>   #define __smp_read_barrier_depends()	read_barrier_depends()
>   #endif
>   
> +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> +
> +#define dependent_ptr_mb(ptr, val) ({					\
> +	long dependent_ptr_mb_val = (long)(val);			\
> +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> +									\
> +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> +})
> +
> +#else
> +
> +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })


So for the example of patch 4, we'd better fall back to rmb() or need a 
dependent_ptr_rmb()?

Thanks
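
To illustrate the question, a user-space sketch of the two fallback
shapes.  The fences are stand-ins (a sequentially consistent fence for
mb(), an acquire fence for rmb()), dependent_ptr_rmb_fallback is a
hypothetical name, and flag/data are invented for the example; the (void)
cast only silences unused-value warnings.

```c
/* Simplified user-space stand-in for the kernel macro (GCC/Clang only). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Stand-ins (assumptions): full fence for mb(), acquire fence for rmb(). */
#define mb_sketch()  __atomic_thread_fence(__ATOMIC_SEQ_CST)
#define rmb_sketch() __atomic_thread_fence(__ATOMIC_ACQUIRE)

/* Shape of the quoted fallback: always pays for a full barrier. */
#define dependent_ptr_mb_fallback(ptr, val) \
	({ (void)(val); mb_sketch(); (ptr); })

/* Hypothetical read variant: enough when only load ordering is needed. */
#define dependent_ptr_rmb_fallback(ptr, val) \
	({ (void)(val); rmb_sketch(); (ptr); })

static int flag;   /* set by a producer in a real scenario */
static int data;   /* payload the producer publishes before flag */

/* Returns the payload once flag is observed, or -1 if it is not yet set. */
static int consume(void)
{
	int q = READ_ONCE(flag);

	if (q) {
		int *p = dependent_ptr_rmb_fallback(&data, q);

		return READ_ONCE(*p);
	}
	return -1;
}
```

With a load-only consumer like this one, the weaker fence preserves the
ordering that matters, which is why a read-flavoured fallback looks
attractive where the dependency trick is unavailable.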


> +
> +#endif
> +
>   #ifdef CONFIG_SMP
>   
>   #ifndef smp_mb
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 6601d39e8c48..f599c30f1b28 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>   #endif
>   
>   #ifndef OPTIMIZER_HIDE_VAR
> +
>   /* Make the optimizer believe the variable can be manipulated arbitrarily. */
>   #define OPTIMIZER_HIDE_VAR(var)						\
>   	__asm__ ("" : "=rm" (var) : "0" (var))
> +
> +#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
> +
>   #endif
>   
>   /* Not-quite-unique ID. */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07  3:58     ` Jason Wang
  (?)
  (?)
@ 2019-01-07  4:23       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-07  4:23 UTC (permalink / raw)
  To: Jason Wang
  Cc: linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> 
> On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> > It's not uncommon to have to access two unrelated memory locations in a
> > specific order.  At the moment one has to use a memory barrier for this.
> > 
> > However, if the first access was a read and the second used an address
> > depending on the first one we would have a data dependency and no
> > barrier would be necessary.
> > 
> > This adds a new interface: dependent_ptr_mb which does exactly this: it
> > returns a pointer with a data dependency on the supplied value.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >   Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
> >   arch/alpha/include/asm/barrier.h  |  1 +
> >   include/asm-generic/barrier.h     | 18 ++++++++++++++++++
> >   include/linux/compiler.h          |  4 ++++
> >   4 files changed, 43 insertions(+)
> > 
> > diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> > index c1d913944ad8..9dbaa2e1dbf6 100644
> > --- a/Documentation/memory-barriers.txt
> > +++ b/Documentation/memory-barriers.txt
> > @@ -691,6 +691,18 @@ case what's actually required is:
> >   		p = READ_ONCE(b);
> >   	}
> > +Alternatively, a control dependency can be converted to a data dependency,
> > +e.g.:
> > +
> > +	q = READ_ONCE(a);
> > +	if (q) {
> > +		b = dependent_ptr_mb(b, q);
> > +		p = READ_ONCE(b);
> > +	}
> > +
> > +Note how the result of dependent_ptr_mb must be used with the following
> > +accesses in order to have an effect.
> > +
> >   However, stores are not speculated.  This means that ordering -is- provided
> >   for load-store control dependencies, as in the following example:
> > @@ -836,6 +848,12 @@ out-guess your code.  More generally, although READ_ONCE() does force
> >   the compiler to actually emit code for a given load, it does not force
> >   the compiler to use the results.
> > +Converting to a data dependency helps with this too:
> > +
> > +	q = READ_ONCE(a);
> > +	b = dependent_ptr_mb(b, q);
> > +	WRITE_ONCE(b, 1);
> > +
> >   In addition, control dependencies apply only to the then-clause and
> >   else-clause of the if-statement in question.  In particular, it does
> >   not necessarily apply to code following the if-statement:
> > @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
> >   for more information.
> > +
> > +
> >   In summary:
> >     (*) Control dependencies can order prior loads against later stores.
> > diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
> > index 92ec486a4f9e..b4934e8c551b 100644
> > --- a/arch/alpha/include/asm/barrier.h
> > +++ b/arch/alpha/include/asm/barrier.h
> > @@ -59,6 +59,7 @@
> >    * as Alpha, "y" could be set to 3 and "x" to 0.  Use rmb()
> >    * in cases like this where there are no data dependencies.
> >    */
> > +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
> >   #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
> >   #ifdef CONFIG_SMP
> > diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> > index 2cafdbb9ae4c..fa2e2ef72b68 100644
> > --- a/include/asm-generic/barrier.h
> > +++ b/include/asm-generic/barrier.h
> > @@ -70,6 +70,24 @@
> >   #define __smp_read_barrier_depends()	read_barrier_depends()
> >   #endif
> > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > +
> > +#define dependent_ptr_mb(ptr, val) ({					\
> > +	long dependent_ptr_mb_val = (long)(val);			\
> > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > +									\
> > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > +})
> > +
> > +#else
> > +
> > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> 
> 
> So for the example of patch 4, we'd better fall back to rmb() or need a
> dependent_ptr_rmb()?
> 
> Thanks

You mean for strongly ordered architectures like Intel?
Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.

mb variant is unused right now so I'll remove it.


> 
> > +
> > +#endif
> > +
> >   #ifdef CONFIG_SMP
> >   #ifndef smp_mb
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 6601d39e8c48..f599c30f1b28 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -152,9 +152,13 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> >   #endif
> >   #ifndef OPTIMIZER_HIDE_VAR
> > +
> >   /* Make the optimizer believe the variable can be manipulated arbitrarily. */
> >   #define OPTIMIZER_HIDE_VAR(var)						\
> >   	__asm__ ("" : "=rm" (var) : "0" (var))
> > +
> > +#define COMPILER_HAS_OPTIMIZER_HIDE_VAR 1
> > +
> >   #endif
> >   /* Not-quite-unique ID. */

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07  4:23       ` Michael S. Tsirkin
  (?)
  (?)
@ 2019-01-07  6:50         ` Jason Wang
  -1 siblings, 0 replies; 94+ messages in thread
From: Jason Wang @ 2019-01-07  6:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse


On 2019/1/7 12:23 PM, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
>> On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
>>> It's not uncommon to have to access two unrelated memory locations in a
>>> specific order.  At the moment one has to use a memory barrier for this.
>>>
>>> However, if the first access was a read and the second used an address
>>> depending on the first one we would have a data dependency and no
>>> barrier would be necessary.
>>>
>>> This adds a new interface: dependent_ptr_mb which does exactly this: it
>>> returns a pointer with a data dependency on the supplied value.
>>>
>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>> ---
>>>    Documentation/memory-barriers.txt | 20 ++++++++++++++++++++
>>>    arch/alpha/include/asm/barrier.h  |  1 +
>>>    include/asm-generic/barrier.h     | 18 ++++++++++++++++++
>>>    include/linux/compiler.h          |  4 ++++
>>>    4 files changed, 43 insertions(+)
>>>
>>> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
>>> index c1d913944ad8..9dbaa2e1dbf6 100644
>>> --- a/Documentation/memory-barriers.txt
>>> +++ b/Documentation/memory-barriers.txt
>>> @@ -691,6 +691,18 @@ case what's actually required is:
>>>    		p = READ_ONCE(b);
>>>    	}
>>> +Alternatively, a control dependency can be converted to a data dependency,
>>> +e.g.:
>>> +
>>> +	q = READ_ONCE(a);
>>> +	if (q) {
>>> +		b = dependent_ptr_mb(b, q);
>>> +		p = READ_ONCE(b);
>>> +	}
>>> +
>>> +Note how the result of dependent_ptr_mb must be used with the following
>>> +accesses in order to have an effect.
>>> +
>>>    However, stores are not speculated.  This means that ordering -is- provided
>>>    for load-store control dependencies, as in the following example:
>>> @@ -836,6 +848,12 @@ out-guess your code.  More generally, although READ_ONCE() does force
>>>    the compiler to actually emit code for a given load, it does not force
>>>    the compiler to use the results.
>>> +Converting to a data dependency helps with this too:
>>> +
>>> +	q = READ_ONCE(a);
>>> +	b = dependent_ptr_mb(b, q);
>>> +	WRITE_ONCE(b, 1);
>>> +
>>>    In addition, control dependencies apply only to the then-clause and
>>>    else-clause of the if-statement in question.  In particular, it does
>>>    not necessarily apply to code following the if-statement:
>>> @@ -875,6 +893,8 @@ to the CPU containing it.  See the section on "Multicopy atomicity"
>>>    for more information.
>>> +
>>> +
>>>    In summary:
>>>      (*) Control dependencies can order prior loads against later stores.
>>> diff --git a/arch/alpha/include/asm/barrier.h b/arch/alpha/include/asm/barrier.h
>>> index 92ec486a4f9e..b4934e8c551b 100644
>>> --- a/arch/alpha/include/asm/barrier.h
>>> +++ b/arch/alpha/include/asm/barrier.h
>>> @@ -59,6 +59,7 @@
>>>     * as Alpha, "y" could be set to 3 and "x" to 0.  Use rmb()
>>>     * in cases like this where there are no data dependencies.
>>>     */
>>> +#define ARCH_NEEDS_READ_BARRIER_DEPENDS 1
>>>    #define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
>>>    #ifdef CONFIG_SMP
>>> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
>>> index 2cafdbb9ae4c..fa2e2ef72b68 100644
>>> --- a/include/asm-generic/barrier.h
>>> +++ b/include/asm-generic/barrier.h
>>> @@ -70,6 +70,24 @@
>>>    #define __smp_read_barrier_depends()	read_barrier_depends()
>>>    #endif
>>> +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
>>> +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
>>> +
>>> +#define dependent_ptr_mb(ptr, val) ({					\
>>> +	long dependent_ptr_mb_val = (long)(val);			\
>>> +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
>>> +									\
>>> +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
>>> +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
>>> +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
>>> +})
>>> +
>>> +#else
>>> +
>>> +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
>> So for the example of patch 4, we'd better fall back to rmb() or need a
>> dependent_ptr_rmb()?
>>
>> Thanks
> You mean for strongly ordered architectures like Intel?
> Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
>
> mb variant is unused right now so I'll remove it.
>
>

Yes.

Thanks



^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07  4:23       ` Michael S. Tsirkin
@ 2019-01-07  9:46         ` Peter Zijlstra
  -1 siblings, 0 replies; 94+ messages in thread
From: Peter Zijlstra @ 2019-01-07  9:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	linux-arch, netdev, virtualization, Jonathan Corbet,
	Richard Henderson, Ivan Kokshaysky, Matt Turner, Arnd Bergmann,
	Luc Van Oostenryck, linux-doc, linux-alpha, linux-sparse

On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:

> > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > +
> > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > +									\
> > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > +})
> > > +
> > > +#else
> > > +
> > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > 
> > 
> > So for the example of patch 4, we'd better fall back to rmb() or need a
> > dependent_ptr_rmb()?
> > 
> > Thanks
> 
> You mean for strongly ordered architectures like Intel?
> Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> 
> mb variant is unused right now so I'll remove it.

How about naming the thing: dependent_ptr() ? That is without any (r)mb
implications at all. The address dependency is strictly weaker than an
rmb in that it will only order the two loads in question and not, like
rmb, any prior to any later load.
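
To make the suggestion concrete, here is a minimal userspace sketch of what a plain dependent_ptr() could look like. It is not the patch's code: HIDE_VAR stands in for the kernel's OPTIMIZER_HIDE_VAR, the dependent_ptr and read_flag_then_data names are hypothetical, uintptr_t replaces the patch's long to keep the wrap-around arithmetic well defined, and the volatile casts only approximate READ_ONCE(). It relies on GCC/Clang extensions (statement expressions, __typeof__, inline asm).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's OPTIMIZER_HIDE_VAR(): an empty asm
 * that takes var as both input and output, so the compiler can no longer
 * reason about its value.
 */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Hypothetical dependent_ptr(): evaluates to ptr, but the address is
 * recomputed from val, so the CPU sees an address dependency on val.
 */
#define dependent_ptr(ptr, val) ({					\
	uintptr_t dp_val_ = (uintptr_t)(val);				\
	uintptr_t dp_ptr_ = (uintptr_t)(ptr) - dp_val_;			\
									\
	static_assert(sizeof(val) <= sizeof(uintptr_t),			\
		      "val must fit in a pointer-sized integer");	\
	HIDE_VAR(dp_val_);						\
	(__typeof__(ptr))(dp_ptr_ + dp_val_);				\
})

/* Load *flag, then load *data through a pointer that depends on the flag. */
static int read_flag_then_data(const int *flag, const int *data)
{
	int q = *(const volatile int *)flag;	/* roughly READ_ONCE(*flag) */
	const int *p = dependent_ptr(data, q);	/* same address, tied to q */

	return *(const volatile int *)p;	/* roughly READ_ONCE(*p) */
}
```

Only the load through p is ordered after the load of *flag here. Any other load in the caller remains free to be reordered, which is the sense in which this is weaker than an rmb. On Alpha even this ordering is not provided, hence the ARCH_NEEDS_READ_BARRIER_DEPENDS fallback in the patch.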

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07  9:46         ` Peter Zijlstra
@ 2019-01-07 13:36           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-07 13:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jason Wang, linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	linux-arch, netdev, virtualization, Jonathan Corbet,
	Richard Henderson, Ivan Kokshaysky, Matt Turner, Arnd Bergmann,
	Luc Van Oostenryck, linux-doc, linux-alpha, linux-sparse

On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> 
> > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > +
> > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > +									\
> > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > +})
> > > > +
> > > > +#else
> > > > +
> > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > 
> > > 
> > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > dependent_ptr_rmb()?
> > > 
> > > Thanks
> > 
> > You mean for strongly ordered architectures like Intel?
> > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > 
> > mb variant is unused right now so I'll remove it.
> 
> How about naming the thing: dependent_ptr() ? That is without any (r)mb
> implications at all. The address dependency is strictly weaker than an
> rmb in that it will only order the two loads in question and not, like
> rmb, any prior to any later load.

So I'm fine with this as it's enough for virtio, but I would like to point out two things:

1. E.g. on x86 both SMP and DMA variants can be NOPs but
the mandatory one can't, so assuming we do not want
it to be stronger than rmb then either we want
smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
or we will just specify that dependent_ptr() works for
both DMA and SMP.

2. Down the road, someone might want to order a store after a load.
Address dependency does that for us too. Assuming we make
dependent_ptr a NOP on x86, we will want an mb variant
which isn't a NOP on x86. Will we want to rename
dependent_ptr to dependent_ptr_rmb at that point?
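
For readers following along, here is a minimal userspace sketch of the
construct under discussion. It is not the patch itself: HIDE_VAR,
dependent_ptr_sketch and read_after_flag are hypothetical names, uintptr_t
replaces the long arithmetic of the quoted macro to sidestep signed
overflow, and GCC or Clang is assumed (statement expressions and extended
asm). It only shows that the returned pointer equals the original while its
computation passes through the loaded value; whether the CPU then orders
the two loads is the architecture question raised above.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's OPTIMIZER_HIDE_VAR (GCC/Clang only). */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Return ptr, but computed so that it depends on val: subtract val,
 * hide val from the optimizer, then add it back. The compiler cannot
 * prove that the two values cancel, so it has to emit the arithmetic.
 */
#define dependent_ptr_sketch(ptr, val) ({                 \
	uintptr_t dp_val = (uintptr_t)(val);              \
	uintptr_t dp_ptr = (uintptr_t)(ptr) - dp_val;     \
	HIDE_VAR(dp_val);                                 \
	(__typeof__(ptr))(dp_ptr + dp_val);               \
})

/* Hypothetical consumer: load a flag, then load data through the
 * dependent pointer so the second address derives from the first load. */
static int read_after_flag(const volatile int *flag, const int *data)
{
	int f = *flag;                                   /* first load */
	const int *p = dependent_ptr_sketch(data, f);    /* same address as data */

	return f ? *p : -1;                              /* second load */
}
```

A caller would pass the address of the flag it polls and the address of the
data it wants to read afterwards; the result is the data when the flag is
set and -1 otherwise.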

Thanks,

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07 13:36           ` Michael S. Tsirkin
@ 2019-01-07 15:54             ` Peter Zijlstra
  -1 siblings, 0 replies; 94+ messages in thread
From: Peter Zijlstra @ 2019-01-07 15:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Jason Wang, linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	linux-arch, netdev, virtualization, Jonathan Corbet,
	Richard Henderson, Ivan Kokshaysky, Matt Turner, Arnd Bergmann,
	Luc Van Oostenryck, linux-doc, linux-alpha, linux-sparse

On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:

> > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > implications at all. The address dependency is strictly weaker than an
> > rmb in that it will only order the two loads in question and not, like
> > rmb, any prior to any later load.
> 
> So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> 
> 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> the mandatory one can't, so assuming we do not want
> it to be stronger than rmb then either we want
> smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> or we will just specify that dependent_ptr() works for
> both DMA and SMP.

The latter; the construct simply generates dependent loads. It is up to
the CPU as to what all that works for.

> 2. Down the road, someone might want to order a store after a load.
> Address dependency does that for us too. Assuming we make
> dependent_ptr a NOP on x86, we will want an mb variant
> which isn't a NOP on x86. Will we want to rename
> dependent_ptr to dependent_ptr_rmb at that point?

Not sure; what is the actual overhead of the construct on x86 vs the
NOP?

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07 15:54             ` Peter Zijlstra
  (?)
@ 2019-01-07 16:22               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-07 16:22 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jason Wang, linux-kernel, Alan Stern, Andrea Parri, Will Deacon,
	Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave,
	Luc Maranget, Paul E. McKenney, Akira Yokosawa, Daniel Lustig,
	linux-arch, netdev, virtualization, Jonathan Corbet,
	Richard Henderson, Ivan Kokshaysky, Matt Turner, Arnd Bergmann,
	Luc Van Oostenryck, linux-doc, linux-alpha, linux-sparse

On Mon, Jan 07, 2019 at 04:54:23PM +0100, Peter Zijlstra wrote:
> On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> > On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> 
> > > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > > implications at all. The address dependency is strictly weaker than an
> > > rmb in that it will only order the two loads in question and not, like
> > > rmb, any prior to any later load.
> > 
> > So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> > 
> > 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> > the mandatory one can't, so assuming we do not want
> > it to be stronger than rmb then either we want
> > smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> > or we will just specify that dependent_ptr() works for
> > both DMA and SMP.
> 
> The latter; the construct simply generates dependent loads. It is up to
> the CPU as to what all that works for.

But not on Intel, right? On Intel, loads are ordered, so it can be a NOP.

> > 2. Down the road, someone might want to order a store after a load.
> > Address dependency does that for us too. Assuming we make
> > dependent_ptr a NOP on x86, we will want an mb variant
> > which isn't a NOP on x86. Will we want to rename
> > dependent_ptr to dependent_ptr_rmb at that point?
> 
> Not sure; what is the actual overhead of the construct on x86 vs the
> NOP?

I'll have to check. There's a pipeline stall almost for sure - that's
why we put it there after all :).

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 2/4] include/linux/compiler.h: allow memory operands
  2019-01-02 20:57 ` Michael S. Tsirkin
@ 2019-01-07 17:54     ` Will Deacon
  0 siblings, 0 replies; 94+ messages in thread
From: Will Deacon @ 2019-01-07 17:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: linux-kernel, Jason Wang, Alan Stern, Andrea Parri,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Luc Van Oostenryck, linux-sparse

On Wed, Jan 02, 2019 at 03:57:54PM -0500, Michael S. Tsirkin wrote:
> We don't really care whether the variable is in-register
> or in-memory. Relax the constraint accordingly.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/linux/compiler.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 1ad367b4cd8d..6601d39e8c48 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -154,7 +154,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  #ifndef OPTIMIZER_HIDE_VAR
>  /* Make the optimizer believe the variable can be manipulated arbitrarily. */
>  #define OPTIMIZER_HIDE_VAR(var)						\
> -	__asm__ ("" : "=r" (var) : "0" (var))
> +	__asm__ ("" : "=rm" (var) : "0" (var))
>  #endif

I think this can break for architectures with write-back addressing modes
such as arm, where the "m" constraint is assumed to be evaluated precisely
once in the asm block.

Will
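
To make the constraint under discussion concrete, here is a small userspace
sketch of the register-only form that the patch would have relaxed.
HIDE_VAR_REG and launder are hypothetical names, and GCC or Clang extended
asm is assumed. The "0" input ties the input to the same register as the
"=r" output, so the empty template changes nothing at run time, yet the
compiler must treat the value as unknown afterwards. One reading of the
objection above is that "=rm" would additionally let the compiler choose a
memory operand, possibly with a write-back address, which an empty template
never actually uses.

```c
/* Register-only form, matching the existing definition quoted above. */
#define HIDE_VAR_REG(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Pass a value through the empty asm. The optimizer can no longer
 * reason about x, but the value itself is left untouched. */
static long launder(long x)
{
	HIDE_VAR_REG(x);
	return x;
}
```

Calling launder(7) still yields 7; the only effect is on what the compiler
is allowed to assume about the result.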

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 2/4] include/linux/compiler.h: allow memory operands
  2019-01-07 17:54     ` Will Deacon
@ 2019-01-07 18:16       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-07 18:16 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Jason Wang, Alan Stern, Andrea Parri,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Luc Van Oostenryck, linux-sparse

On Mon, Jan 07, 2019 at 05:54:27PM +0000, Will Deacon wrote:
> On Wed, Jan 02, 2019 at 03:57:54PM -0500, Michael S. Tsirkin wrote:
> > We don't really care whether the variable is in-register
> > or in-memory. Relax the constraint accordingly.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/linux/compiler.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 1ad367b4cd8d..6601d39e8c48 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -154,7 +154,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> >  #ifndef OPTIMIZER_HIDE_VAR
> >  /* Make the optimizer believe the variable can be manipulated arbitrarily. */
> >  #define OPTIMIZER_HIDE_VAR(var)						\
> > -	__asm__ ("" : "=r" (var) : "0" (var))
> > +	__asm__ ("" : "=rm" (var) : "0" (var))
> >  #endif
> 
> I think this can break for architectures with write-back addressing modes
> such as arm, where the "m" constraint is assumed to be evaluated precisely
> once in the asm block.
> 
> Will

Thanks, I'll drop this patch.

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07 13:36           ` Michael S. Tsirkin
  (?)
@ 2019-01-07 19:02             ` Paul E. McKenney
  -1 siblings, 0 replies; 94+ messages in thread
From: Paul E. McKenney @ 2019-01-07 19:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Zijlstra, Jason Wang, linux-kernel, Alan Stern,
	Andrea Parri, Will Deacon, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> > On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > > On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> > 
> > > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > > +									\
> > > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > > +})
> > > > > +
> > > > > +#else
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > > 
> > > > 
> > > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > > dependent_ptr_rmb()?
> > > > 
> > > > Thanks
> > > 
> > > You mean for strongly ordered architectures like Intel?
> > > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > > 
> > > mb variant is unused right now so I'll remove it.
> > 
> > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > implications at all. The address dependency is strictly weaker than an
> > rmb in that it will only order the two loads in question and not, like
> > rmb, any prior to any later load.
> 
> So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> 
> 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> the mandatory one can't, so assuming we do not want
> it to be stronger than rmb then either we want
> smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> or we will just specify that dependent_ptr() works for
> both DMA and SMP.
> 
> 2. Down the road, someone might want to order a store after a load.
> Address dependency does that for us too. Assuming we make
> dependent_ptr a NOP on x86, we will want an mb variant
> which isn't a NOP on x86. Will we want to rename
> dependent_ptr to dependent_ptr_rmb at that point?

But x86 preserves store-after-load orderings anyway, and even Alpha
respects ordering from loads to dependent stores.  So what am I missing
here?
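
For reference, the pattern in question (a load followed by a store whose
address depends on the loaded value) looks roughly like the sketch below.
READ_ONCE_SKETCH, WRITE_ONCE_SKETCH, slots, next_slot and publish are all
hypothetical userspace stand-ins, with GCC or Clang assumed for __typeof__.
The code shows only the shape of the dependency; the ordering guarantee
itself is the hardware property described above, not something this snippet
demonstrates.

```c
/* Volatile accesses so the compiler emits exactly one load or store. */
#define READ_ONCE_SKETCH(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

static int slots[4];
static int next_slot;

/* Load an index, then store to a location chosen by that index.
 * The store address cannot be formed until the load has returned. */
static int publish(int value)
{
	int idx = READ_ONCE_SKETCH(next_slot) & 3;   /* load */

	WRITE_ONCE_SKETCH(slots[idx], value);        /* dependent store */
	return idx;
}
```

With next_slot set to 2, publish(9) writes 9 into slots[2] and returns 2.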

							Thanx, Paul


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
@ 2019-01-07 19:02             ` Paul E. McKenney
  0 siblings, 0 replies; 94+ messages in thread
From: Paul E. McKenney @ 2019-01-07 19:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Zijlstra, Jason Wang, linux-kernel, Alan Stern,
	Andrea Parri, Will Deacon, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann

On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> > On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > > On 2019/1/3 上午4:57, Michael S. Tsirkin wrote:
> > 
> > > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > > +									\
> > > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > > +})
> > > > > +
> > > > > +#else
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > > 
> > > > 
> > > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > > dependent_ptr_rmb()?
> > > > 
> > > > Thanks
> > > 
> > > You mean for strongly ordered architectures like Intel?
> > > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > > 
> > > mb variant is unused right now so I'll remove it.
> > 
> > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > implications at all. The address dependency is strictly weaker than an
> > rmb in that it will only order the two loads in qestion and not, like
> > rmb, any prior to any later load.
> 
> So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> 
> 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> the madatory one can't, so assuming we do not want
> it to be stronger than rmp then either we want
> smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> or we just will specify that dependent_ptr() works for
> both DMA and SMP.
> 
> 2. Down the road, someone might want to order a store after a load.
> Address dependency does that for us too. Assuming we make
> dependent_ptr a NOP on x86, we will want an mb variant
> which isn't a NOP on x86. Will we want to rename
> dependent_ptr to dependent_ptr_rmb at that point?

But x86 preserves store-after-load orderings anyway, and even Alpha
respects ordering from loads to dependent stores.  So what am I missing
here?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
@ 2019-01-07 19:02             ` Paul E. McKenney
  0 siblings, 0 replies; 94+ messages in thread
From: Paul E. McKenney @ 2019-01-07 19:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Zijlstra, Jason Wang, linux-kernel, Alan Stern,
	Andrea Parri, Will Deacon, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann

On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> > On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > > On 2019/1/3 上午4:57, Michael S. Tsirkin wrote:
> > 
> > > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > > +									\
> > > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > > +})
> > > > > +
> > > > > +#else
> > > > > +
> > > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > > 
> > > > 
> > > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > > dependent_ptr_rmb()?
> > > > 
> > > > Thanks
> > > 
> > > You mean for strongly ordered architectures like Intel?
> > > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > > 
> > > mb variant is unused right now so I'll remove it.
> > 
> > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > implications at all. The address dependency is strictly weaker than an
> > rmb in that it will only order the two loads in question and not, like
> > rmb, any prior to any later load.
> 
> So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> 
> 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> the mandatory one can't, so assuming we do not want
> it to be stronger than rmb then either we want
> smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> or we just will specify that dependent_ptr() works for
> both DMA and SMP.
> 
> 2. Down the road, someone might want to order a store after a load.
> Address dependency does that for us too. Assuming we make
> dependent_ptr a NOP on x86, we will want an mb variant
> which isn't a NOP on x86. Will we want to rename
> dependent_ptr to dependent_ptr_rmb at that point?

But x86 preserves store-after-load orderings anyway, and even Alpha
respects ordering from loads to dependent stores.  So what am I missing
here?
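
For readers following the quoted macro, here is a minimal userspace sketch
of the idea (not kernel code, and GCC/Clang only).  The names HIDE_VAR,
dependent_ptr_sketch, read_after_flag and store_after_flag are made up
for illustration.  The returned pointer has the same value as the one
passed in, but the compiler can no longer prove that, so it has to
compute the address from the previously loaded value.  Whether the
hardware then orders the two accesses is architecture-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for OPTIMIZER_HIDE_VAR: an empty asm that "modifies" var. */
#define HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Sketch of the quoted macro: subtract val, hide val from the optimizer,
 * add it back.  The result equals ptr but is computed from val.
 * uintptr_t is used so the arithmetic wraps instead of overflowing.
 */
#define dependent_ptr_sketch(ptr, val) ({ \
	uintptr_t dp_val = (uintptr_t)(val); \
	uintptr_t dp_ptr = (uintptr_t)(ptr) - dp_val; \
	_Static_assert(sizeof(val) <= sizeof(uintptr_t), "val too wide"); \
	HIDE_VAR(dp_val); \
	(__typeof__(ptr))(dp_ptr + dp_val); \
})

/* Load a flag, then load data through a pointer that depends on the flag. */
static int read_after_flag(const int *flag, const int *data)
{
	int f = *(const volatile int *)flag;	/* first load */
	const int *p = dependent_ptr_sketch(data, f);

	return *p;				/* dependent load */
}

/* Load a flag, then store through a pointer that depends on the flag. */
static void store_after_flag(const int *flag, int *slot, int v)
{
	int f = *(const volatile int *)flag;	/* load */
	int *p = dependent_ptr_sketch(slot, f);

	*p = v;					/* dependent store */
}
```

The second helper corresponds to the load-to-dependent-store case
discussed above.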

							Thanx, Paul

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07 19:02             ` Paul E. McKenney
  (?)
@ 2019-01-07 19:13               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-07 19:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Peter Zijlstra, Jason Wang, linux-kernel, Alan Stern,
	Andrea Parri, Will Deacon, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Mon, Jan 07, 2019 at 11:02:36AM -0800, Paul E. McKenney wrote:
> On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> > On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> > > On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > > > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > > > On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> > > 
> > > > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > > > +
> > > > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > > > +									\
> > > > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > > > +})
> > > > > > +
> > > > > > +#else
> > > > > > +
> > > > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > > > 
> > > > > 
> > > > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > > > dependent_ptr_rmb()?
> > > > > 
> > > > > Thanks
> > > > 
> > > > You mean for strongly ordered architectures like Intel?
> > > > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > > > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > > > 
> > > > mb variant is unused right now so I'll remove it.
> > > 
> > > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > > implications at all. The address dependency is strictly weaker than an
> > > rmb in that it will only order the two loads in question and not, like
> > > rmb, any prior to any later load.
> > 
> > So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> > 
> > 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> > the mandatory one can't, so assuming we do not want
> > it to be stronger than rmb then either we want
> > smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> > or we just will specify that dependent_ptr() works for
> > both DMA and SMP.
> > 
> > 2. Down the road, someone might want to order a store after a load.
> > Address dependency does that for us too. Assuming we make
> > dependent_ptr a NOP on x86, we will want an mb variant
> > which isn't a NOP on x86. Will we want to rename
> > dependent_ptr to dependent_ptr_rmb at that point?
> 
> But x86 preserves store-after-load orderings anyway, and even Alpha
> respects ordering from loads to dependent stores.  So what am I missing
> here?
> 
> 							Thanx, Paul

Oh you are right. Stores are not reordered with older loads on x86.

So point 2 is moot. Sorry about the noise.

I guess at this point the only sticking point is the ECC compiler.
I'm inclined to stick an mb() there, seeing as it doesn't even
have spectre protection enabled. Slow but safe.
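
A rough userspace sketch of that "slow but safe" fallback, under stated
assumptions: SKETCH_HAS_HIDE_VAR, SKETCH_FORCE_FENCE, dependent_ptr_or_fence
and load_pair are hypothetical names, and the C11 atomic_thread_fence() only
stands in for the kernel's mb() (it is not equivalent to it).  When the
compiler cannot hide a variable from the optimizer, the macro falls back
to a full fence and returns the pointer unchanged.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical feature test; the patch keys off COMPILER_HAS_OPTIMIZER_HIDE_VAR. */
#if defined(__GNUC__) && !defined(SKETCH_FORCE_FENCE)
#define SKETCH_HAS_HIDE_VAR 1
#endif

#ifdef SKETCH_HAS_HIDE_VAR
/* Fast path: build an address dependency on val, no fence instruction. */
#define dependent_ptr_or_fence(ptr, val) ({ \
	uintptr_t dp_val = (uintptr_t)(val); \
	uintptr_t dp_ptr = (uintptr_t)(ptr) - dp_val; \
	__asm__ ("" : "=r" (dp_val) : "0" (dp_val)); \
	(__typeof__(ptr))(dp_ptr + dp_val); \
})
#else
/* Slow path: full fence, then hand back the pointer as-is. */
#define dependent_ptr_or_fence(ptr, val) \
	((void)(val), atomic_thread_fence(memory_order_seq_cst), (ptr))
#endif

/* Either branch gives the caller the same interface. */
static int load_pair(const int *flag, const int *data)
{
	int f = *(const volatile int *)flag;	/* first load */

	return *dependent_ptr_or_fence(data, f);	/* ordered second load */
}
```

Building with -DSKETCH_FORCE_FENCE selects the fence branch, which is
handy for comparing the generated code of the two variants.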

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 3/4] barriers: convert a control to a data dependency
  2019-01-07 19:13               ` Michael S. Tsirkin
  (?)
@ 2019-01-07 19:25                 ` Paul E. McKenney
  -1 siblings, 0 replies; 94+ messages in thread
From: Paul E. McKenney @ 2019-01-07 19:25 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Zijlstra, Jason Wang, linux-kernel, Alan Stern,
	Andrea Parri, Will Deacon, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization,
	Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Arnd Bergmann, Luc Van Oostenryck, linux-doc, linux-alpha,
	linux-sparse

On Mon, Jan 07, 2019 at 02:13:29PM -0500, Michael S. Tsirkin wrote:
> On Mon, Jan 07, 2019 at 11:02:36AM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 07, 2019 at 08:36:36AM -0500, Michael S. Tsirkin wrote:
> > > On Mon, Jan 07, 2019 at 10:46:10AM +0100, Peter Zijlstra wrote:
> > > > On Sun, Jan 06, 2019 at 11:23:07PM -0500, Michael S. Tsirkin wrote:
> > > > > On Mon, Jan 07, 2019 at 11:58:23AM +0800, Jason Wang wrote:
> > > > > > On 2019/1/3 4:57 AM, Michael S. Tsirkin wrote:
> > > > 
> > > > > > > +#if defined(COMPILER_HAS_OPTIMIZER_HIDE_VAR) && \
> > > > > > > +	!defined(ARCH_NEEDS_READ_BARRIER_DEPENDS)
> > > > > > > +
> > > > > > > +#define dependent_ptr_mb(ptr, val) ({					\
> > > > > > > +	long dependent_ptr_mb_val = (long)(val);			\
> > > > > > > +	long dependent_ptr_mb_ptr = (long)(ptr) - dependent_ptr_mb_val;	\
> > > > > > > +									\
> > > > > > > +	BUILD_BUG_ON(sizeof(val) > sizeof(long));			\
> > > > > > > +	OPTIMIZER_HIDE_VAR(dependent_ptr_mb_val);			\
> > > > > > > +	(typeof(ptr))(dependent_ptr_mb_ptr + dependent_ptr_mb_val);	\
> > > > > > > +})
> > > > > > > +
> > > > > > > +#else
> > > > > > > +
> > > > > > > +#define dependent_ptr_mb(ptr, val) ({ mb(); (ptr); })
> > > > > > 
> > > > > > 
> > > > > > So for the example of patch 4, we'd better fall back to rmb() or need a
> > > > > > dependent_ptr_rmb()?
> > > > > > 
> > > > > > Thanks
> > > > > 
> > > > > You mean for strongly ordered architectures like Intel?
> > > > > Yes, maybe it makes sense to have dependent_ptr_smp_rmb,
> > > > > dependent_ptr_dma_rmb and dependent_ptr_virt_rmb.
> > > > > 
> > > > > mb variant is unused right now so I'll remove it.
> > > > 
> > > > How about naming the thing: dependent_ptr() ? That is without any (r)mb
> > > > implications at all. The address dependency is strictly weaker than an
> > > > rmb in that it will only order the two loads in question and not, like
> > > > rmb, any prior to any later load.
> > > 
> > > So I'm fine with this as it's enough for virtio, but I would like to point out two things:
> > > 
> > > 1. E.g. on x86 both SMP and DMA variants can be NOPs but
> > > the mandatory one can't, so assuming we do not want
> > > it to be stronger than rmb then either we want
> > > smp_dependent_ptr(), dma_dependent_ptr(), dependent_ptr()
> > > or we just will specify that dependent_ptr() works for
> > > both DMA and SMP.
> > > 
> > > 2. Down the road, someone might want to order a store after a load.
> > > Address dependency does that for us too. Assuming we make
> > > dependent_ptr a NOP on x86, we will want an mb variant
> > > which isn't a NOP on x86. Will we want to rename
> > > dependent_ptr to dependent_ptr_rmb at that point?
> > 
> > But x86 preserves store-after-load orderings anyway, and even Alpha
> > respects ordering from loads to dependent stores.  So what am I missing
> > here?
> > 
> > 							Thanx, Paul
> 
> Oh you are right. Stores are not reordered with older loads on x86.
> 
> So point 2 is moot. Sorry about the noise.
> 
> I guess at this point the only sticking point is the ECC compiler.
> I'm inclined to stick an mb() there, seeing as it doesn't even
> have spectre protection enabled. Slow but safe.

Well, there is a mention of DMA above, which on some systems throws in
a wild card.  I would certainly hope that DMA would integrate nicely
with the cache-coherence protocols these days, unlike 25 years ago,
but who knows?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-02 20:57   ` Michael S. Tsirkin
  (?)
@ 2019-01-08 17:44     ` Nick Desaulniers
  -1 siblings, 0 replies; 94+ messages in thread
From: Nick Desaulniers @ 2019-01-08 17:44 UTC (permalink / raw)
  To: Michael S. Tsirkin, Miguel Ojeda
  Cc: LKML, Jason Wang, Alan Stern, Andrea Parri, Will Deacon,
	Peter Zijlstra, Boqun Feng, Nicholas Piggin, David Howells,
	Jade Alglave, Luc Maranget, Paul E. McKenney, Akira Yokosawa,
	Daniel Lustig, linux-arch, netdev, virtualization, Eli Friedman,
	Joe Perches, Linus Torvalds, Luc Van Oostenryck, linux-sparse,
	Eric Christopher

Thanks for the patch and sorry for the delay; was totally unplugged
for the holidays.

On Wed, Jan 2, 2019 at 12:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> Since commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
> mutually exclusive") clang no longer reuses the OPTIMIZER_HIDE_VAR macro
> from compiler-gcc - instead it gets the version in
> include/linux/compiler.h.  Unfortunately that version doesn't actually
> prevent compiler from optimizing out the variable.

Good catch. Did you find this via eyeballing the code, a test, or some
other way?

>
> Fix up by moving the macro out from compiler-gcc.h to compiler.h.
> Compilers without inline asm support will keep working
> since it's protected by an ifdef.
>
> Also fix up comments to match reality since we are no longer overriding
> any macros.
>
> Build-tested with gcc and clang.
>
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> Cc: Eli Friedman <efriedma@codeaurora.org>
> Cc: Joe Perches <joe@perches.com>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Also for more context, see:
commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
dead store elimination")

> ---
>  include/linux/compiler-clang.h | 5 ++---
>  include/linux/compiler-gcc.h   | 4 ----
>  include/linux/compiler-intel.h | 4 +---
>  include/linux/compiler.h       | 4 +++-
>  4 files changed, 6 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index 3e7dafb3ea80..7ddaeb5182e3 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -3,9 +3,8 @@
>  #error "Please don't include <linux/compiler-clang.h> directly, include <linux/compiler.h> instead."
>  #endif
>
> -/* Some compiler specific definitions are overwritten here
> - * for Clang compiler
> - */
> +/* Compiler specific definitions for Clang compiler */
> +
>  #define uninitialized_var(x) x = *(&(x))
>
>  /* same as gcc, this was present in clang-2.6 so we can assume it works
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index 2010493e1040..72054d9f0eaa 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -58,10 +58,6 @@
>         (typeof(ptr)) (__ptr + (off));                                  \
>  })
>
> -/* Make the optimizer believe the variable can be manipulated arbitrarily. */
> -#define OPTIMIZER_HIDE_VAR(var)                                                \
> -       __asm__ ("" : "=r" (var) : "0" (var))
> -
>  /*
>   * A trick to suppress uninitialized variable warning without generating any
>   * code
> diff --git a/include/linux/compiler-intel.h b/include/linux/compiler-intel.h
> index 517bd14e1222..b17f3cd18334 100644
> --- a/include/linux/compiler-intel.h
> +++ b/include/linux/compiler-intel.h
> @@ -5,9 +5,7 @@
>
>  #ifdef __ECC
>
> -/* Some compiler specific definitions are overwritten here
> - * for Intel ECC compiler
> - */
> +/* Compiler specific definitions for Intel ECC compiler */
>
>  #include <asm/intrinsics.h>
>
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 06396c1cf127..1ad367b4cd8d 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -152,7 +152,9 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  #endif
>
>  #ifndef OPTIMIZER_HIDE_VAR
> -#define OPTIMIZER_HIDE_VAR(var) barrier()
> +/* Make the optimizer believe the variable can be manipulated arbitrarily. */
> +#define OPTIMIZER_HIDE_VAR(var)                                                \
> +       __asm__ ("" : "=r" (var) : "0" (var))
>  #endif

This should be fine, thanks for the cleanup!  For now, we're not yet
confident to turn on Clang's integrated assembler for the kernel, but
I'll make sure to revisit this should we, in case Clang is then able
to optimize this out.
+ Eric, who might know of a better trick for what we're trying to
accomplish with this macro.
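
A minimal sketch of what the macro is meant to achieve, assuming GCC or
Clang extended asm. bytes_differ() is a hypothetical helper for
illustration and is not part of the patch. The empty asm takes the
variable as both input and output, so the optimizer has to treat its
value as unknown afterwards and cannot turn the loop into an early exit:

```c
#include <assert.h>
#include <stddef.h>

/* Definition quoted from the patch above (GCC/Clang extended asm). */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Hypothetical helper, not part of the patch: returns nonzero if the
 * two buffers differ, always walking all n bytes.
 */
static unsigned long bytes_differ(const unsigned char *a,
				  const unsigned char *b, size_t n)
{
	unsigned long diff = 0;

	assert(a && b);
	while (n--) {
		diff |= (unsigned long)(*a++ ^ *b++);
		/* The compiler must now assume diff holds an arbitrary value. */
		OPTIMIZER_HIDE_VAR(diff);
	}
	return diff;
}
```

With the old fallback, barrier(), the variable itself is not an operand
of the asm, so a value that lives only in a register is not hidden from
the optimizer.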

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

+ Miguel
Miguel, would you mind taking this into your compiler-attributes tree?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-08 17:44     ` Nick Desaulniers
  (?)
@ 2019-01-08 18:50       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-08 18:50 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Miguel Ojeda, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, netdev,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Tue, Jan 08, 2019 at 09:44:28AM -0800, Nick Desaulniers wrote:
> Thanks for the patch and sorry for the delay; was totally unplugged
> for the holidays.
> On Wed, Jan 2, 2019 at 12:57 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > Since commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h
> > mutually exclusive") clang no longer reuses the OPTIMIZER_HIDE_VAR macro
> > from compiler-gcc - instead it gets the version in
> > include/linux/compiler.h.  Unfortunately that version doesn't actually
> > prevent compiler from optimizing out the variable.
> 
> Good catch. Did you find this via eyeballing the code, a test, or some
> other way?

eyeballing

> >
> > Fix up by moving the macro out from compiler-gcc.h to compiler.h.
> > Compilers without inline asm support will keep working
> > since it's protected by an ifdef.
> >
> > Also fix up comments to match reality since we are no longer overriding
> > any macros.
> >
> > Build-tested with gcc and clang.
> >
> > Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive")
> > Cc: Eli Friedman <efriedma@codeaurora.org>
> > Cc: Joe Perches <joe@perches.com>
> > Cc: Nick Desaulniers <ndesaulniers@google.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> Also for more context, see:
> commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> dead store elimination")


Interesting. That added this text:

 * while gcc behavior gets along with a normal
 * barrier(), llvm needs an explicit input variable to be assumed
 * clobbered.

however:
#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")

So no explicit variable is clobbered.
Weird isn't it?
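
One possible reading, sketched under the assumption of GCC/Clang
extended asm semantics: ptr is only an input operand, but passing it
hands the buffer's address to the asm, and the "memory" clobber then
says the asm may read or write memory it can reach. Together that should
be enough to keep the preceding stores from being treated as dead.
zero_explicit() and sum_and_wipe() are hypothetical names used only for
illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Definition as quoted above. */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

/* Hypothetical userspace analogue of memzero_explicit(). */
static void zero_explicit(void *s, size_t n)
{
	assert(s);
	memset(s, 0, n);
	/*
	 * s is an input only, yet it makes the buffer's address escape
	 * into the asm; the "memory" clobber covers whatever it points to.
	 */
	barrier_data(s);
}

/* Hypothetical caller: the stack buffer is dead once the sum is taken. */
static unsigned int sum_and_wipe(void)
{
	unsigned char key[16];
	unsigned int sum = 0;
	size_t i;

	memset(key, 0xAA, sizeof(key));
	for (i = 0; i < sizeof(key); i++)
		sum += key[i];
	zero_explicit(key, sizeof(key));
	return sum;
}
```

Whether the final memset survives optimization can only be confirmed by
inspecting the generated code for a given compiler and flags.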



> > ---
> >  include/linux/compiler-clang.h | 5 ++---
> >  include/linux/compiler-gcc.h   | 4 ----
> >  include/linux/compiler-intel.h | 4 +---
> >  include/linux/compiler.h       | 4 +++-
> >  4 files changed, 6 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> > index 3e7dafb3ea80..7ddaeb5182e3 100644
> > --- a/include/linux/compiler-clang.h
> > +++ b/include/linux/compiler-clang.h
> > @@ -3,9 +3,8 @@
> >  #error "Please don't include <linux/compiler-clang.h> directly, include <linux/compiler.h> instead."
> >  #endif
> >
> > -/* Some compiler specific definitions are overwritten here
> > - * for Clang compiler
> > - */
> > +/* Compiler specific definitions for Clang compiler */
> > +
> >  #define uninitialized_var(x) x = *(&(x))
> >
> >  /* same as gcc, this was present in clang-2.6 so we can assume it works
> > diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> > index 2010493e1040..72054d9f0eaa 100644
> > --- a/include/linux/compiler-gcc.h
> > +++ b/include/linux/compiler-gcc.h
> > @@ -58,10 +58,6 @@
> >         (typeof(ptr)) (__ptr + (off));                                  \
> >  })
> >
> > -/* Make the optimizer believe the variable can be manipulated arbitrarily. */
> > -#define OPTIMIZER_HIDE_VAR(var)                                                \
> > -       __asm__ ("" : "=r" (var) : "0" (var))
> > -
> >  /*
> >   * A trick to suppress uninitialized variable warning without generating any
> >   * code
> > diff --git a/include/linux/compiler-intel.h b/include/linux/compiler-intel.h
> > index 517bd14e1222..b17f3cd18334 100644
> > --- a/include/linux/compiler-intel.h
> > +++ b/include/linux/compiler-intel.h
> > @@ -5,9 +5,7 @@
> >
> >  #ifdef __ECC
> >
> > -/* Some compiler specific definitions are overwritten here
> > - * for Intel ECC compiler
> > - */
> > +/* Compiler specific definitions for Intel ECC compiler */
> >
> >  #include <asm/intrinsics.h>
> >
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 06396c1cf127..1ad367b4cd8d 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -152,7 +152,9 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
> >  #endif
> >
> >  #ifndef OPTIMIZER_HIDE_VAR
> > -#define OPTIMIZER_HIDE_VAR(var) barrier()
> > +/* Make the optimizer believe the variable can be manipulated arbitrarily. */
> > +#define OPTIMIZER_HIDE_VAR(var)                                                \
> > +       __asm__ ("" : "=r" (var) : "0" (var))
> >  #endif
> 
> This should be fine, thanks for the cleanup!  For now, we're not yet
> confident to turn on Clang's integrated assembler for the kernel, but
> I'll make sure to revisit this should we, in case Clang is then able
> to optimize this out.
> + Eric, who might know of a better trick for what we're trying to
> accomplish with this macro.
> 
> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> 
> + Miguel
> Miguel, would you mind taking this into your compiler-attributes tree?
> -- 
> Thanks,
> ~Nick Desaulniers

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-08 17:44     ` Nick Desaulniers
  (?)
@ 2019-01-09 10:35       ` Miguel Ojeda
  -1 siblings, 0 replies; 94+ messages in thread
From: Miguel Ojeda @ 2019-01-09 10:35 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Michael S. Tsirkin, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Tue, Jan 8, 2019 at 6:44 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> Also for more context, see:
> commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> dead store elimination")

By the way, shouldn't that barrier_data() be directly in compiler.h
too, since it is for both gcc & clang?
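
For reference, a minimal userspace sketch of the two macros under discussion, assuming GCC or Clang extended inline asm. The macro bodies follow the definition quoted in the diff above and the barrier_data() definition in compiler-gcc.h; the helpers hide(), zero_explicit() and demo() are hypothetical names used only for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same body as the definition the patch moves into compiler.h. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Same body as barrier_data() in compiler-gcc.h. */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r" (ptr) : "memory")

/*
 * Hypothetical helper: the empty asm emits no instructions and leaves the
 * register untouched, but the compiler must treat x as possibly modified.
 */
static int hide(int x)
{
	OPTIMIZER_HIDE_VAR(x);
	return x;
}

/*
 * Hypothetical helper modelled on memzero_explicit(): the asm takes the
 * pointer as an input and clobbers memory, so the memset() is not a dead
 * store from the compiler's point of view.
 */
static void zero_explicit(void *s, size_t n)
{
	memset(s, 0, n);
	barrier_data(s);
}

/* Usage sketch: call from main() or a test harness. */
static void demo(void)
{
	char secret[8] = "hunter2";

	assert(hide(42) == 42);
	zero_explicit(secret, sizeof(secret));
	assert(secret[0] == 0);
}
```

Neither macro depends on anything GCC-specific beyond the inline asm syntax that Clang also accepts, which is the basis for asking whether both can live in compiler.h.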

> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
>
> + Miguel
> Miguel, would you mind taking this into your compiler-attributes tree?

Sure, at least we quickly get some linux-next time.

Note it would be nice to separate the patch into two (one for the
comments, another for OPTIMIZER_HIDE_VAR), and also possibly another
for barrier_data().

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-09 10:35       ` Miguel Ojeda
  (?)
@ 2019-01-09 14:50         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-09 14:50 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Nick Desaulniers, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> On Tue, Jan 8, 2019 at 6:44 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > Also for more context, see:
> > commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> > dead store elimination")
> 
> By the way, shouldn't that barrier_data() be directly in compiler.h
> too, since it is for both gcc & clang?
> 
> > Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> >
> > + Miguel
> > Miguel, would you mind taking this into your compiler-attributes tree?
> 
> Sure, at least we get quickly some linux-next time.
> 
> Note it would be nice to separate the patch into two (one for the
> comments, another for OPTIMIZER_HIDE_VAR), and also possibly another
> for barrier_data().
> 
> Cheers,
> Miguel

Okay, I will try.

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-09 10:35       ` Miguel Ojeda
  (?)
@ 2019-01-10  2:36         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-10  2:36 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Nick Desaulniers, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> On Tue, Jan 8, 2019 at 6:44 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > Also for more context, see:
> > commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> > dead store elimination")
> 
> By the way, shouldn't that barrier_data() be directly in compiler.h
> too, since it is for both gcc & clang?
> 
> > Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> >
> > + Miguel
> > Miguel, would you mind taking this into your compiler-attributes tree?
> 
> Sure, at least we get quickly some linux-next time.


BTW, why linux-next? Shouldn't this go into 5.0 and stable? It's a bugfix after all.

> Note it would be nice to separate the patch into two (one for the
> comments, another for OPTIMIZER_HIDE_VAR), and also possibly another
> for barrier_data().
> 
> Cheers,
> Miguel

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-10  2:36         ` Michael S. Tsirkin
  (?)
@ 2019-01-10 13:41           ` Dan Carpenter
  -1 siblings, 0 replies; 94+ messages in thread
From: Dan Carpenter @ 2019-01-10 13:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Miguel Ojeda, Nick Desaulniers, LKML, Jason Wang, Alan Stern,
	Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, linux-arch,
	Network Development, virtualization, Eli Friedman, Joe Perches,
	Linus Torvalds, Luc Van Oostenryck, linux-sparse,
	Eric Christopher

On Wed, Jan 09, 2019 at 09:36:41PM -0500, Michael S. Tsirkin wrote:
> On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> > On Tue, Jan 8, 2019 at 6:44 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > Also for more context, see:
> > > commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> > > dead store elimination")
> > 
> > By the way, shouldn't that barrier_data() be directly in compiler.h
> > too, since it is for both gcc & clang?
> > 
> > > Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> > >
> > > + Miguel
> > > Miguel, would you mind taking this into your compiler-attributes tree?
> > 
> > Sure, at least we get quickly some linux-next time.
> 
> 
> BTW why linux-next? shouldn't this go into 5.0 and stable? It's a bugfix after all.
> 

It doesn't hurt to put things in linux-next for a week and then 5.0 and
-stable.  Not a lot of testing happens on linux-next, but some does.

regards,
dan carpenter


^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-10 13:41           ` Dan Carpenter
  (?)
@ 2019-01-10 14:08             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-10 14:08 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Miguel Ojeda, Nick Desaulniers, LKML, Jason Wang, Alan Stern,
	Andrea Parri, Will Deacon, Peter Zijlstra, Boqun Feng,
	Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget,
	Paul E. McKenney, Akira Yokosawa, Daniel Lustig, linux-arch,
	Network Development, virtualization, Eli Friedman, Joe Perches,
	Linus Torvalds, Luc Van Oostenryck, linux-sparse,
	Eric Christopher

On Thu, Jan 10, 2019 at 04:41:39PM +0300, Dan Carpenter wrote:
> On Wed, Jan 09, 2019 at 09:36:41PM -0500, Michael S. Tsirkin wrote:
> > On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> > > On Tue, Jan 8, 2019 at 6:44 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > >
> > > > Also for more context, see:
> > > > commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
> > > > dead store elimination")
> > > 
> > > By the way, shouldn't that barrier_data() be directly in compiler.h
> > > too, since it is for both gcc & clang?
> > > 
> > > > Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
> > > >
> > > > + Miguel
> > > > Miguel, would you mind taking this into your compiler-attributes tree?
> > > 
> > > Sure, at least we get quickly some linux-next time.
> > 
> > 
> > BTW why linux-next? shouldn't this go into 5.0 and stable? It's a bugfix after all.
> > 
> 
> It doesn't hurt to put things in linux-next for a week and then 5.0 and
> -stable.  Not a lot of testing happens on linux-next, but some does.
> 
> regards,
> dan carpenter

I misunderstood. Sure that makes sense.

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-09 14:50         ` Michael S. Tsirkin
@ 2019-01-19 18:35           ` Miguel Ojeda
  -1 siblings, 0 replies; 94+ messages in thread
From: Miguel Ojeda @ 2019-01-19 18:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Nick Desaulniers, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

Hi Michael,

On Wed, Jan 9, 2019 at 3:50 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> > Note it would be nice to separate the patch into two (one for the
> > comments, another for OPTIMIZER_HIDE_VAR), and also possibly another
> > for barrier_data().
>
> Okay, I will try.

Did you have a chance to do it (or maybe I missed the patches)? If
not, no worries, I can send this to Linus as it is and get it in
already, then we can do the barrier_data later.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-19 18:35           ` Miguel Ojeda
@ 2019-01-20 14:43             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 94+ messages in thread
From: Michael S. Tsirkin @ 2019-01-20 14:43 UTC (permalink / raw)
  To: Miguel Ojeda
  Cc: Nick Desaulniers, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Sat, Jan 19, 2019 at 07:35:33PM +0100, Miguel Ojeda wrote:
> Hi Michael,
> 
> On Wed, Jan 9, 2019 at 3:50 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Wed, Jan 09, 2019 at 11:35:52AM +0100, Miguel Ojeda wrote:
> > > Note it would be nice to separate the patch into two (one for the
> > > comments, another for OPTIMIZER_HIDE_VAR), and also possibly another
> > > for barrier_data().
> >
> > Okay, I will try.
> 
> Did you have a chance to do it (or maybe I missed the patches)? If
> not, no worries, I can send this to Linus as it is and get it in
> already, then we can do the barrier_data later.
> 
> Cheers,
> Miguel

No not yet. Sorry! Pls send this one in, barrier_data will likely miss
the next merge window.

-- 
MST

^ permalink raw reply	[flat|nested] 94+ messages in thread

* Re: [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR
  2019-01-20 14:43             ` Michael S. Tsirkin
@ 2019-01-20 15:36               ` Miguel Ojeda
  -1 siblings, 0 replies; 94+ messages in thread
From: Miguel Ojeda @ 2019-01-20 15:36 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Nick Desaulniers, LKML, Jason Wang, Alan Stern, Andrea Parri,
	Will Deacon, Peter Zijlstra, Boqun Feng, Nicholas Piggin,
	David Howells, Jade Alglave, Luc Maranget, Paul E. McKenney,
	Akira Yokosawa, Daniel Lustig, linux-arch, Network Development,
	virtualization, Eli Friedman, Joe Perches, Linus Torvalds,
	Luc Van Oostenryck, linux-sparse, Eric Christopher

On Sun, Jan 20, 2019 at 3:43 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> No not yet. Sorry! Pls send this one in, barrier_data will likely miss
> the next merge window.

No worries! Done.

Cheers,
Miguel

^ permalink raw reply	[flat|nested] 94+ messages in thread

end of thread, other threads:[~2019-01-20 15:36 UTC | newest]

Thread overview: 94+ messages
2019-01-02 20:57 [PATCH RFC 0/4] barriers using data dependency Michael S. Tsirkin
2019-01-02 20:57 ` Michael S. Tsirkin
2019-01-02 20:57 ` [PATCH RFC 1/4] include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR Michael S. Tsirkin
2019-01-02 20:57   ` Michael S. Tsirkin
2019-01-02 20:57   ` Michael S. Tsirkin
2019-01-08 17:44   ` Nick Desaulniers
2019-01-08 17:44     ` Nick Desaulniers
2019-01-08 17:44     ` Nick Desaulniers
2019-01-08 18:50     ` Michael S. Tsirkin
2019-01-08 18:50       ` Michael S. Tsirkin
2019-01-08 18:50       ` Michael S. Tsirkin
2019-01-08 18:50     ` Michael S. Tsirkin
2019-01-09 10:35     ` Miguel Ojeda
2019-01-09 10:35       ` Miguel Ojeda
2019-01-09 10:35       ` Miguel Ojeda
2019-01-09 14:50       ` Michael S. Tsirkin
2019-01-09 14:50         ` Michael S. Tsirkin
2019-01-09 14:50         ` Michael S. Tsirkin
2019-01-19 18:35         ` Miguel Ojeda
2019-01-19 18:35           ` Miguel Ojeda
2019-01-20 14:43           ` Michael S. Tsirkin
2019-01-20 14:43           ` Michael S. Tsirkin
2019-01-20 14:43             ` Michael S. Tsirkin
2019-01-20 15:36             ` Miguel Ojeda
2019-01-20 15:36               ` Miguel Ojeda
2019-01-09 14:50       ` Michael S. Tsirkin
2019-01-10  2:36       ` Michael S. Tsirkin
2019-01-10  2:36         ` Michael S. Tsirkin
2019-01-10  2:36         ` Michael S. Tsirkin
2019-01-10 13:41         ` Dan Carpenter
2019-01-10 13:41           ` Dan Carpenter
2019-01-10 13:41           ` Dan Carpenter
2019-01-10 14:08           ` Michael S. Tsirkin
2019-01-10 14:08             ` Michael S. Tsirkin
2019-01-10 14:08             ` Michael S. Tsirkin
2019-01-10 13:41         ` Dan Carpenter
2019-01-10  2:36       ` Michael S. Tsirkin
2019-01-02 20:57 ` Michael S. Tsirkin
2019-01-02 20:57 ` [PATCH RFC 2/4] include/linux/compiler.h: allow memory operands Michael S. Tsirkin
2019-01-02 20:57 ` Michael S. Tsirkin
2019-01-07 17:54   ` Will Deacon
2019-01-07 17:54     ` Will Deacon
2019-01-07 18:16     ` Michael S. Tsirkin
2019-01-07 18:16       ` Michael S. Tsirkin
2019-01-02 20:57 ` [PATCH RFC 3/4] barriers: convert a control to a data dependency Michael S. Tsirkin
2019-01-02 20:57   ` Michael S. Tsirkin
2019-01-02 20:57   ` Michael S. Tsirkin
2019-01-02 21:00   ` Matthew Wilcox
2019-01-02 21:00   ` Matthew Wilcox
2019-01-02 21:00     ` Matthew Wilcox
2019-01-02 21:00     ` Matthew Wilcox
2019-01-02 21:24     ` Michael S. Tsirkin
2019-01-02 21:24     ` Michael S. Tsirkin
2019-01-02 21:24       ` Michael S. Tsirkin
2019-01-02 21:24       ` Michael S. Tsirkin
2019-01-07  3:58   ` Jason Wang
2019-01-07  3:58     ` Jason Wang
2019-01-07  4:23     ` Michael S. Tsirkin
2019-01-07  4:23       ` Michael S. Tsirkin
2019-01-07  4:23       ` Michael S. Tsirkin
2019-01-07  4:23       ` Michael S. Tsirkin
2019-01-07  6:50       ` Jason Wang
2019-01-07  6:50         ` Jason Wang
2019-01-07  6:50         ` Jason Wang
2019-01-07  6:50         ` Jason Wang
2019-01-07  9:46       ` Peter Zijlstra
2019-01-07  9:46         ` Peter Zijlstra
2019-01-07 13:36         ` Michael S. Tsirkin
2019-01-07 13:36           ` Michael S. Tsirkin
2019-01-07 15:54           ` Peter Zijlstra
2019-01-07 15:54             ` Peter Zijlstra
2019-01-07 16:22             ` Michael S. Tsirkin
2019-01-07 16:22               ` Michael S. Tsirkin
2019-01-07 16:22               ` Michael S. Tsirkin
2019-01-07 16:22             ` Michael S. Tsirkin
2019-01-07 19:02           ` Paul E. McKenney
2019-01-07 19:02             ` Paul E. McKenney
2019-01-07 19:02             ` Paul E. McKenney
2019-01-07 19:13             ` Michael S. Tsirkin
2019-01-07 19:13             ` Michael S. Tsirkin
2019-01-07 19:13               ` Michael S. Tsirkin
2019-01-07 19:13               ` Michael S. Tsirkin
2019-01-07 19:25               ` Paul E. McKenney
2019-01-07 19:25                 ` Paul E. McKenney
2019-01-07 19:25                 ` Paul E. McKenney
2019-01-02 20:57 ` Michael S. Tsirkin
2019-01-02 20:58 ` [PATCH RFC 4/4] virtio: use dependent_ptr_mb Michael S. Tsirkin
2019-01-02 20:58 ` Michael S. Tsirkin
2019-01-02 21:36 ` [PATCH RFC 0/4] barriers using data dependency Alan Stern
2019-01-02 21:36   ` Alan Stern
2019-01-02 23:04   ` Michael S. Tsirkin
2019-01-02 23:04     ` Michael S. Tsirkin
2019-01-03 15:11     ` Alan Stern
2019-01-03 15:11       ` Alan Stern
