* [PATCH] bitops: add _local bitops
@ 2012-05-09 13:45 Michael S. Tsirkin
  2012-05-09 14:03 ` H. Peter Anvin
                   ` (4 more replies)
  0 siblings, 5 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 13:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, Michael S. Tsirkin, David Howells, Akinobu Mita,
	Alexey Dobriyan, Herbert Xu, Stephen Rothwell, linux-doc,
	linux-kernel, linux-arch, Gleb Natapov, Paolo Bonzini, kvm,
	Avi Kivity, Marcelo Tosatti, Linus Torvalds

kvm needs to update some hypervisor variables atomically
in a sense that the operation can't be interrupted
in the middle. However the hypervisor always runs
on the same CPU so it does not need any memory
barrier or lock prefix.

At Peter Anvin's suggestion, add _local bitops for this purpose:
define them as non-atomics for x86 and (for now) atomics
for everyone else.

Uses are not restricted to virtualization: they
might be useful to communicate with an interrupt
handler if we know that it's running on the same CPU.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---

Link to previous discussion:
http://www.spinics.net/lists/kvm/msg72241.html
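
For illustration only (not part of the patch), here is a rough user-space
sketch of the semantics the generic fallback is meant to provide. It assumes
GCC/Clang __atomic builtins with relaxed ordering in place of the kernel's
set_bit() and friends; the function bodies and demo_pending() are
hypothetical stand-ins, not kernel code.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in: one atomic read-modify-write, no ordering implied. */
static inline void set_bit_local(unsigned long nr, volatile unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);

	__atomic_fetch_or(addr + nr / BITS_PER_LONG, mask, __ATOMIC_RELAXED);
}

/* Hypothetical stand-in: clears the bit and reports whether it was set. */
static inline int test_and_clear_bit_local(unsigned long nr,
					   volatile unsigned long *addr)
{
	unsigned long mask = 1UL << (nr % BITS_PER_LONG);
	unsigned long old;

	old = __atomic_fetch_and(addr + nr / BITS_PER_LONG, ~mask,
				 __ATOMIC_RELAXED);
	return (old & mask) != 0;
}

/* A pending-work mask of the kind shared with a handler on the same CPU. */
static int demo_pending(void)
{
	unsigned long pending[2] = { 0, 0 };

	set_bit_local(3, pending);
	set_bit_local(BITS_PER_LONG + 1, pending);	/* second word, bit 1 */
	assert(test_and_clear_bit_local(3, pending) == 1);
	assert(test_and_clear_bit_local(3, pending) == 0);
	return pending[1] == 2UL;
}
```

The relaxed ordering mirrors the "no memory barrier is implied" wording in
the documentation hunk below; anything stronger is left to the caller.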


 Documentation/atomic_ops.txt              |   19 ++++++
 arch/x86/include/asm/bitops.h             |    1 +
 include/asm-generic/bitops.h              |    1 +
 include/asm-generic/bitops/local-atomic.h |   92 +++++++++++++++++++++++++++++
 include/asm-generic/bitops/local.h        |   85 ++++++++++++++++++++++++++
 include/linux/bitops.h                    |    8 +++
 6 files changed, 206 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/bitops/local-atomic.h
 create mode 100644 include/asm-generic/bitops/local.h

diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
index 27f2b21..b7e3b67 100644
--- a/Documentation/atomic_ops.txt
+++ b/Documentation/atomic_ops.txt
@@ -520,6 +520,25 @@ The __clear_bit_unlock version is non-atomic, however it still implements
 unlock barrier semantics. This can be useful if the lock itself is protecting
 the other bits in the word.
 
+Local versions of the bitmask operations are also provided.  They are used in
+contexts where the operations need to be performed atomically with respect to
+the local CPU, but no other CPU accesses the bitmask.  This assumption makes it
+possible to avoid the need for SMP protection and use less expensive atomic
+operations in the implementation.
+They have names similar to the above bitmask operation interfaces,
+except that _local is sufficed to the interface name.
+
+	void set_bit_local(unsigned long nr, volatile unsigned long *addr);
+	void clear_bit_local(unsigned long nr, volatile unsigned long *addr);
+	void change_bit_local(unsigned long nr, volatile unsigned long *addr);
+	int test_and_set_bit_local(unsigned long nr, volatile unsigned long *addr);
+	int test_and_clear_bit_local(unsigned long nr, volatile unsigned long *addr);
+	int test_and_change_bit_local(unsigned long nr, volatile unsigned long *addr);
+
+These local variants are useful for example if the bitmask may be accessed from
+a local interrupt, or from a hypervisor on the same CPU if running in a VM.
+These local variants also do not have any special memory barrier semantics.
+
 Finally, there are non-atomic versions of the bitmask operations
 provided.  They are used in contexts where some other higher-level SMP
 locking scheme is being used to protect the bitmask, and thus less
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index b97596e..8784cd7 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -509,6 +509,7 @@ static __always_inline int fls64(__u64 x)
 #include <asm-generic/bitops/le.h>
 
 #include <asm-generic/bitops/ext2-atomic-setbit.h>
+#include <asm-generic/bitops/local.h>
 
 #endif /* __KERNEL__ */
 #endif /* _ASM_X86_BITOPS_H */
diff --git a/include/asm-generic/bitops.h b/include/asm-generic/bitops.h
index 280ca7a..d720c9e 100644
--- a/include/asm-generic/bitops.h
+++ b/include/asm-generic/bitops.h
@@ -40,5 +40,6 @@
 #include <asm-generic/bitops/non-atomic.h>
 #include <asm-generic/bitops/le.h>
 #include <asm-generic/bitops/ext2-atomic.h>
+#include <asm-generic/bitops/local-atomic.h>
 
 #endif /* __ASM_GENERIC_BITOPS_H */
diff --git a/include/asm-generic/bitops/local-atomic.h b/include/asm-generic/bitops/local-atomic.h
new file mode 100644
index 0000000..94ad261
--- /dev/null
+++ b/include/asm-generic/bitops/local-atomic.h
@@ -0,0 +1,92 @@
+#ifndef ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H
+#define ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H
+/**
+ * Local atomic operations
+ *
+ * These operations give no atomicity or ordering guarantees if result
+ * observed from another CPU.  Atomicity is guaranteed if result is observed
+ * from the same CPU, e.g. from a local interrupt, or a hypervisor if running
+ * in a VM.
+ * Atomicity is not guaranteed across CPUs: if two instances of these operations
+ * race on different CPUs, one can appear to succeed but actually fail.  Use
+ * non-local atomics instead or protect such SMP accesses with a lock.
+ * These operations can be reordered. No memory barrier is implied.
+ */
+
+/**
+ * Implement local operations in terms of atomics.
+ * For use from a local interrupt, this is always safe but suboptimal: many
+ * architectures can use bitops/local.h instead.
+ * For use from a hypervisor, make sure your architecture doesn't
+ * rely on locks for atomics: if it does - override these operations.
+ */
+
+#define HAVE_ASM_BITOPS_LOCAL
+
+/**
+ * set_bit_local - Sets a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+#define set_bit_local(nr, addr) set_bit(nr, addr)
+
+/**
+ * clear_bit_local - Clears a bit in memory
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+#define clear_bit_local(nr, addr) clear_bit(nr, addr)
+
+/**
+ * test_and_set_bit_local - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define test_and_set_bit_local(nr, addr) test_and_set_bit(nr, addr)
+
+/**
+ * test_and_clear_bit_local - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ * Note that @nr may be almost arbitrarily large; this function is not
+ * restricted to acting on a single-word quantity.
+ */
+#define test_and_clear_bit_local(nr, addr) test_and_clear_bit(nr, addr)
+
+/**
+ * change_bit_local - Toggle a bit in memory
+ * @nr: Bit to change
+ * @addr: Address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define change_bit_local(nr, addr) change_bit(nr, addr)
+
+/**
+ * test_and_change_bit_local - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define test_and_change_bit_local(nr, addr) test_and_change_bit(nr, addr)
+
+#endif /* ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H */
diff --git a/include/asm-generic/bitops/local.h b/include/asm-generic/bitops/local.h
new file mode 100644
index 0000000..64ab358
--- /dev/null
+++ b/include/asm-generic/bitops/local.h
@@ -0,0 +1,85 @@
+#ifndef ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H
+#define ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H
+/**
+ * Local atomic operations
+ *
+ * These operations give no atomicity or ordering guarantees if result
+ * observed from another CPU.  Atomicity is guaranteed if result is observed
+ * from the same CPU, e.g. from a local interrupt, or a hypervisor if running
+ * in a VM.
+ * Atomicity is not guaranteed across CPUs: if two instances of these operations
+ * race on different CPUs, one can appear to succeed but actually fail.  Use
+ * non-local atomics instead or protect such SMP accesses with a lock.
+ * These operations can be reordered. No memory barrier is implied.
+ */
+
+
+/**
+ * Implement local operations in terms of non-atomics.
+ * Only safe on architectures, such as x86, that implement
+ * non-atomics in terms of a single instruction.
+ */
+
+#define HAVE_ASM_BITOPS_LOCAL
+
+/**
+ * set_bit_local - Sets a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define set_bit_local(nr, addr) set_bit(nr, addr)
+
+/**
+ * clear_bit_local - Clears a bit in memory
+ * @nr: Bit to clear
+ * @addr: Address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define clear_bit_local(nr, addr) clear_bit(nr, addr)
+
+/**
+ * test_and_set_bit_local - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define test_and_set_bit_local(nr, addr) test_and_set_bit(nr, addr)
+
+/**
+ * test_and_clear_bit_local - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define test_and_clear_bit_local(nr, addr) test_and_clear_bit(nr, addr)
+
+/**
+ * change_bit_local - Toggle a bit in memory
+ * @nr: Bit to change
+ * @addr: Address to start counting from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define change_bit_local(nr, addr) change_bit(nr, addr)
+
+/**
+ * test_and_change_bit_local - Change a bit and return its old value
+ * @nr: Bit to change
+ * @addr: Address to count from
+ *
+ * This operation is atomic with respect to local CPU only. No memory barrier
+ * is implied.
+ */
+#define test_and_change_bit_local(nr, addr) test_and_change_bit(nr, addr)
+
+#endif /* ASM_GENERIC_BITOPS_LOCAL_ATOMIC_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index a3b6b82..ba86418 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -197,5 +197,13 @@ extern unsigned long find_last_bit(const unsigned long *addr,
 				   unsigned long size);
 #endif
 
+/**
+ * Include the generic version for local atomics unless
+ * an architecture overrides it.
+ */
+#ifndef HAVE_ASM_BITOPS_LOCAL
+#include <asm-generic/bitops/local-atomic.h>
+#endif
+
 #endif /* __KERNEL__ */
 #endif
-- 
MST


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 13:45 [PATCH] bitops: add _local bitops Michael S. Tsirkin
@ 2012-05-09 14:03 ` H. Peter Anvin
  2012-05-09 15:06   ` Michael S. Tsirkin
                     ` (2 more replies)
  2012-05-09 14:06 ` Arnd Bergmann
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 14:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 06:45 AM, Michael S. Tsirkin wrote:
> kvm needs to update some hypervisor variables atomically
> in a sense that the operation can't be interrupted
> in the middle. However the hypervisor always runs
> on the same CPU so it does not need any memory
> barrier or lock prefix.
> 
> At Peter Anvin's suggestion, add _local bitops for this purpose:
> define them as non-atomics for x86 and (for now) atomics
> for everyone else.
> 
> Uses are not restricted to virtualization: they
> might be useful to communicate with an interrupt
> handler if we know that it's running on the same CPU.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

I don't think you can use the x86 nonatomics as-is, because they don't
contain optimization barriers.

	-hpa



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 13:45 [PATCH] bitops: add _local bitops Michael S. Tsirkin
  2012-05-09 14:03 ` H. Peter Anvin
@ 2012-05-09 14:06 ` Arnd Bergmann
  2012-05-09 14:17 ` Avi Kivity
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 20+ messages in thread
From: Arnd Bergmann @ 2012-05-09 14:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: H. Peter Anvin, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wednesday 09 May 2012, Michael S. Tsirkin wrote:
>  Documentation/atomic_ops.txt              |   19 ++++++
>  arch/x86/include/asm/bitops.h             |    1 +
>  include/asm-generic/bitops.h              |    1 +
>  include/asm-generic/bitops/local-atomic.h |   92 +++++++++++++++++++++++++++++
>  include/asm-generic/bitops/local.h        |   85 ++++++++++++++++++++++++++
>  include/linux/bitops.h                    |    8 +++
>  6 files changed, 206 insertions(+), 0 deletions(-)

Unless I'm misreading the patch, you have two versions of the same file here,
where one version should be enough.

Both versions look fine to me though, so if you remove one of them:

Acked-by: Arnd Bergmann <arnd@arndb.de>


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 13:45 [PATCH] bitops: add _local bitops Michael S. Tsirkin
  2012-05-09 14:03 ` H. Peter Anvin
  2012-05-09 14:06 ` Arnd Bergmann
@ 2012-05-09 14:17 ` Avi Kivity
  2012-05-09 19:19 ` Andrew Morton
  2012-05-10 17:38 ` Rob Landley
  4 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2012-05-09 14:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: H. Peter Anvin, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Arnd Bergmann, Andrew Morton, David Howells, Akinobu Mita,
	Alexey Dobriyan, Herbert Xu, Stephen Rothwell, linux-doc,
	linux-kernel, linux-arch, Gleb Natapov, Paolo Bonzini, kvm,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 04:45 PM, Michael S. Tsirkin wrote:
>  
> +Local versions of the bitmask operations are also provided.  They are used in
> +contexts where the operations need to be performed atomically with respect to
> +the local CPU, but no other CPU accesses the bitmask.  This assumption makes it
> +possible to avoid the need for SMP protection and use less expensive atomic
> +operations in the implementation.
> +They have names similar to the above bitmask operation interfaces,
> +except that _local is sufficed to the interface name.

suffixed (better: appended)

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 14:03 ` H. Peter Anvin
@ 2012-05-09 15:06   ` Michael S. Tsirkin
  2012-05-09 15:43   ` Michael S. Tsirkin
  2012-05-09 15:47   ` Michael S. Tsirkin
  2 siblings, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 15:06 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 07:03:37AM -0700, H. Peter Anvin wrote:
> On 05/09/2012 06:45 AM, Michael S. Tsirkin wrote:
> > kvm needs to update some hypervisor variables atomically
> > in a sense that the operation can't be interrupted
> > in the middle. However the hypervisor always runs
> > on the same CPU so it does not need any memory
> > barrier or lock prefix.
> > 
> > At Peter Anvin's suggestion, add _local bitops for this purpose:
> > define them as non-atomics for x86 and (for now) atomics
> > for everyone else.
> > 
> > Uses are not restricted to virtualization: they
> > might be useful to communicate with an interrupt
> > handler if we know that it's running on the same CPU.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> I don't think you can use the x86 nonatomics as-is, because they don't
> contain optimization barriers.
> 
> 	-hpa

You are right, of course. So I'll remove bitops/local.h,
move the code to x86/ and open-code it.

-- 
MST


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 14:03 ` H. Peter Anvin
  2012-05-09 15:06   ` Michael S. Tsirkin
@ 2012-05-09 15:43   ` Michael S. Tsirkin
  2012-05-09 15:44     ` H. Peter Anvin
  2012-05-09 15:47   ` Michael S. Tsirkin
  2 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 15:43 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 07:03:37AM -0700, H. Peter Anvin wrote:
> On 05/09/2012 06:45 AM, Michael S. Tsirkin wrote:
> > kvm needs to update some hypervisor variables atomically
> > in a sense that the operation can't be interrupted
> > in the middle. However the hypervisor always runs
> > on the same CPU so it does not need any memory
> > barrier or lock prefix.
> > 
> > At Peter Anvin's suggestion, add _local bitops for this purpose:
> > define them as non-atomics for x86 and (for now) atomics
> > for everyone else.
> > 
> > Uses are not restricted to virtualization: they
> > might be useful to communicate with an interrupt
> > handler if we know that it's running on the same CPU.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> I don't think you can use the x86 nonatomics as-is, because they don't
> contain optimization barriers.
> 
> 	-hpa

Just adding a memory clobber to asm will be enough, right?
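
To make the question concrete, here is a rough sketch of what I have in mind.
It is illustrative only: the name set_bit_local_sketch is made up, the x86
branch assumes GCC-style extended asm, and the non-x86 branch is a plain C
placeholder that is NOT guaranteed to be a single instruction.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: set bit nr in the bitmap at addr, CPU-locally. */
static inline void set_bit_local_sketch(unsigned long nr,
					volatile unsigned long *addr)
{
#if defined(__x86_64__) || defined(__i386__)
	/*
	 * A single bts with no lock prefix: it cannot be interrupted
	 * half-way on the local CPU.  The "memory" clobber keeps the
	 * compiler from caching or moving memory accesses across the asm.
	 */
	__asm__ __volatile__("bts %1,%0"
			     : "+m" (*addr)
			     : "r" (nr)
			     : "memory", "cc");
#else
	/* Placeholder only: may compile to a load/modify/store sequence. */
	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
	__asm__ __volatile__("" ::: "memory");
#endif
}
```

The register form of bts treats addr as the base of a bit string, so nr may
point past the first word; the clobber covers that case as well.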

-- 
MST


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 15:43   ` Michael S. Tsirkin
@ 2012-05-09 15:44     ` H. Peter Anvin
  0 siblings, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 15:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 08:43 AM, Michael S. Tsirkin wrote:
> 
> Just adding a memory clobber to asm will be enough, right?
> 

Yes.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 14:03 ` H. Peter Anvin
  2012-05-09 15:06   ` Michael S. Tsirkin
  2012-05-09 15:43   ` Michael S. Tsirkin
@ 2012-05-09 15:47   ` Michael S. Tsirkin
  2012-05-09 16:24     ` H. Peter Anvin
  2 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 15:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 07:03:37AM -0700, H. Peter Anvin wrote:
> On 05/09/2012 06:45 AM, Michael S. Tsirkin wrote:
> > kvm needs to update some hypervisor variables atomically
> > in a sense that the operation can't be interrupted
> > in the middle. However the hypervisor always runs
> > on the same CPU so it does not need any memory
> > barrier or lock prefix.
> > 
> > At Peter Anvin's suggestion, add _local bitops for this purpose:
> > define them as non-atomics for x86 and (for now) atomics
> > for everyone else.
> > 
> > Uses are not restricted to virtualization: they
> > might be useful to communicate with an interrupt
> > handler if we know that it's running on the same CPU.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> I don't think you can use the x86 nonatomics as-is, because they don't
> contain optimization barriers.
> 
> 	-hpa

By the way, clear_bit on x86 does not seem to contain
an optimization barrier - is my reading correct?
Lock prefix does not affect the compiler, right?


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 15:47   ` Michael S. Tsirkin
@ 2012-05-09 16:24     ` H. Peter Anvin
  2012-05-09 16:36       ` Michael S. Tsirkin
  0 siblings, 1 reply; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 16:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 08:47 AM, Michael S. Tsirkin wrote:
> 
> By the way, clear_bit on x86 does not seem to contain
> an optimization barrier - is my reading correct?
> Lock prefix does not affect the compiler, right?

Yes, as it clearly states in the comment:

 * clear_bit() is atomic and may not be reordered.  However, it does
 * not contain a memory barrier, so if it is used for locking purposes,
 * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
 * in order to ensure changes are visible on other processors.

There is clear_bit_unlock() which has the barrier semantics.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 16:24     ` H. Peter Anvin
@ 2012-05-09 16:36       ` Michael S. Tsirkin
  2012-05-09 16:45         ` H. Peter Anvin
  0 siblings, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 16:36 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 09:24:41AM -0700, H. Peter Anvin wrote:
> On 05/09/2012 08:47 AM, Michael S. Tsirkin wrote:
> > 
> > By the way, clear_bit on x86 does not seem to contain
> > an optimization barrier - is my reading correct?
> > Lock prefix does not affect the compiler, right?
> 
> Yes, as it clearly states in the comment:
> 
>  * clear_bit() is atomic and may not be reordered.  However, it does
>  * not contain a memory barrier, so if it is used for locking purposes,
>  * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
>  * in order to ensure changes are visible on other processors.
> 
> There is clear_bit_unlock() which has the barrier semantics.
> 
> 	-hpa

Well it talks about a memory barrier, not an
optimization barrier.

If compiler reorders code, changes will appear in
the wrong order on the current processor,
not just on other processors, no?

Sorry if I'm confused about this point, this is
what Documentation/atomic_ops.txt made me believe:
<quote>
	For example consider the following code:

		while (a > 0)
			do_something();

	If the compiler can prove that do_something() does not store to the
	variable a, then the compiler is within its rights transforming this to
	the following:

		tmp = a;
		if (a > 0)
			for (;;)
				do_something();

</quote>
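
For comparison, here is a small user-space sketch of where a compiler barrier
would go in such a loop. The barrier() definition assumes GCC-style asm, and
fake_interrupt() is a made-up stand-in for whatever updates the variable
behind the compiler's back; since it is an ordinary call here, the barrier is
not strictly required in this sketch and only marks the placement.

```c
#include <assert.h>

/* Compiler-only barrier: emits no instruction, forbids caching of memory. */
#define barrier() __asm__ __volatile__("" ::: "memory")

static int a = 3;
static int iterations;

static void do_something(void)
{
	iterations++;
}

/* Hypothetical: stands in for an interrupt handler that updates 'a'. */
static void fake_interrupt(void)
{
	a--;
}

static int wait_for_a(void)
{
	while (a > 0) {
		do_something();
		fake_interrupt();
		barrier();	/* 'a' must be re-read on the next test */
	}
	return iterations;
}
```

</imports_placeholder_removed>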
> 
> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 16:36       ` Michael S. Tsirkin
@ 2012-05-09 16:45         ` H. Peter Anvin
  2012-05-09 16:55           ` Michael S. Tsirkin
  0 siblings, 1 reply; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 16:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 09:36 AM, Michael S. Tsirkin wrote:
> 
> Well it talks about a memory barrier, not an
> optimization barrier.
> 

Same thing.

> If compiler reorders code, changes will appear in
> the wrong order on the current processor,
> not just on other processors, no?

Yes.

For your _local I would just copy the atomic bitops but remove the locks
in most cases.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 16:45         ` H. Peter Anvin
@ 2012-05-09 16:55           ` Michael S. Tsirkin
  0 siblings, 0 replies; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 16:55 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rob Landley, Thomas Gleixner, Ingo Molnar, x86, Arnd Bergmann,
	Andrew Morton, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 09:45:57AM -0700, H. Peter Anvin wrote:
> On 05/09/2012 09:36 AM, Michael S. Tsirkin wrote:
> > 
> > Well it talks about a memory barrier, not an
> > optimization barrier.
> > 
> 
> Same thing.

I see. So it really should say 'any barrier', right?
Documentation/atomic_ops.txt goes to great length
to distinguish between the two and we probably
should not confuse things.

> > If compiler reorders code, changes will appear in
> > the wrong order on the current processor,
> > not just on other processors, no?
> 
> Yes.

So this seems to contradict what the comment says:

	clear_bit() is atomic and may not be reordered.
and you say compiler *can* reorder it, and below

 you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
 in order to ensure changes are visible on other processors.

and in fact this is not enough, you also need to call
barrier() to ensure changes are visible on the same
processor in the correct order.

> For your _local I would just copy the atomic bitops but remove the locks
> in most cases.
> 
> 	-hpa

Right, I sent v2 that does exactly this.

My question about the documentation for clear_bit
is an unrelated one: to me, it looks like the documentation for
clear_bit does not match the implementation, or at least is somewhat
confusing.

> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 13:45 [PATCH] bitops: add _local bitops Michael S. Tsirkin
                   ` (2 preceding siblings ...)
  2012-05-09 14:17 ` Avi Kivity
@ 2012-05-09 19:19 ` Andrew Morton
  2012-05-09 19:23   ` H. Peter Anvin
  2012-05-09 20:07   ` Michael S. Tsirkin
  2012-05-10 17:38 ` Rob Landley
  4 siblings, 2 replies; 20+ messages in thread
From: Andrew Morton @ 2012-05-09 19:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: H. Peter Anvin, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Arnd Bergmann, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, 9 May 2012 16:45:29 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> kvm needs to update some hypervisor variables atomically
> in a sense that the operation can't be interrupted
> in the middle. However the hypervisor always runs
> on the same CPU so it does not need any memory
> barrier or lock prefix.

Well.  It adds more complexity, makes the kernel harder to understand
and maintain and introduces more opportunities for developers to add
bugs.  So from that point of view, the best way of handling this patch
is to delete it.

Presumably the patch offers some benefit to offset all those costs.
But you didn't tell us what that benefit is, so we cannot make
a decision.

IOW: numbers, please.  Convincing ones, for realistic test cases.


Secondly: can KVM just use __set_bit() and friends?  I suspect those
interfaces happen to meet your requirements.  At least on architectures
you care about.



* Re: [PATCH] bitops: add _local bitops
  2012-05-09 19:19 ` Andrew Morton
@ 2012-05-09 19:23   ` H. Peter Anvin
  2012-05-09 20:07   ` Michael S. Tsirkin
  1 sibling, 0 replies; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 19:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michael S. Tsirkin, Rob Landley, Thomas Gleixner, Ingo Molnar,
	x86, Arnd Bergmann, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 12:19 PM, Andrew Morton wrote:
> 
> Secondly: can KVM just use __set_bit() and friends?  I suspect those
> interfaces happen to meet your requirements.  At least on architectures
> you care about.
> 

__set_bit() and friends right now don't have optimization barriers,
meaning the compiler is free to move them around.

	-hpa


* Re: [PATCH] bitops: add _local bitops
  2012-05-09 19:19 ` Andrew Morton
  2012-05-09 19:23   ` H. Peter Anvin
@ 2012-05-09 20:07   ` Michael S. Tsirkin
  2012-05-09 20:10     ` H. Peter Anvin
  1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 20:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: H. Peter Anvin, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Arnd Bergmann, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 12:19:40PM -0700, Andrew Morton wrote:
> On Wed, 9 May 2012 16:45:29 +0300
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > kvm needs to update some hypervisor variables atomically
> > in a sense that the operation can't be interrupted
> > in the middle. However the hypervisor always runs
> > on the same CPU so it does not need any memory
> > barrier or lock prefix.
> 
> Well.  It adds more complexity, makes the kernel harder to understand
> and maintain and introduces more opportunities for developers to add
> bugs.  So from that point of view, the best way of handling this patch
> is to delete it.
> 
> Presumably the patch offers some benefit to offset all those costs.
> But you didn't tell us what that benefit is, so we cannot make
> a decision.
> 
> IOW: numbers, please.  Convincing ones, for realistic test cases.

I can try but in practice it's not an optimization.
What kvm needs is a guarantee that a memory update will be done in a
single instruction.

> Secondly: can KVM just use __set_bit() and friends?  I suspect those
> interfaces happen to meet your requirements.  At least on architectures
> you care about.

Sigh. I started with this, but then Avi Kivity said that he's worried
about someone changing __test_and_clear_bit on x86
such that the change won't be atomic (single instruction) anymore.
So I put inline asm into kvm.c. This drew a comment from Peter
that maintaining separate asm code in kvm.c adds too much
maintenance overhead, so I should implement _local, add
an asm-generic fallback and put it all in generic code.

In practice ATM any of the above will work. We probably don't even need
to add barrier() calls since what we do afterwards is apic access which
has an optimization barrier anyway.  But I'm fine with adding them in
there just in case if that's what people want.

However, since we've come full circle, I'd like to have a discussion
on what, exactly, is acceptable to all maintainers.
Avi, Andrew, Peter, could you please comment?

-- 
MST
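
The "single instruction" requirement discussed above can be sketched as
follows. This is only an illustration under stated assumptions, not the code
from the patch: test_and_clear_bit_local_sketch() is a hypothetical name, the
x86 branch assumes GCC/Clang inline asm in AT&T syntax, and the fallback uses
the compiler's __atomic builtins to mirror the "atomics for everyone else"
idea from the patch description.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG ((long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical helper: clear bit nr in the bitmap at addr and return
 * the bit's previous value (0 or 1).
 */
static inline int test_and_clear_bit_local_sketch(long nr,
						  volatile unsigned long *addr)
{
	assert(nr >= 0);
#if defined(__x86_64__) || defined(__i386__)
	int oldbit;

	/*
	 * btr copies the bit into CF and clears it in memory as one
	 * instruction, so an interrupt on this CPU cannot land in the
	 * middle of the update. There is no lock prefix, so this is not
	 * atomic against other CPUs. sbb turns CF into 0 or -1. The
	 * "memory" clobber also acts as a compiler barrier.
	 */
	__asm__ __volatile__("btr %2,%1\n\t"
			     "sbb %0,%0"
			     : "=r" (oldbit), "+m" (*addr)
			     : "r" (nr)
			     : "memory", "cc");
	return oldbit != 0;
#else
	/* Fallback: a fully atomic read-modify-write on the target word. */
	unsigned long mask = 1UL << (nr % SKETCH_BITS_PER_LONG);
	unsigned long old;

	old = __atomic_fetch_and(addr + nr / SKETCH_BITS_PER_LONG, ~mask,
				 __ATOMIC_SEQ_CST);
	return (old & mask) != 0;
#endif
}
```

Keeping the asm in one helper with a generic fallback is the shape the thread
converges on: callers do not carry their own architecture-specific code.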

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] bitops: add _local bitops
  2012-05-09 20:07   ` Michael S. Tsirkin
@ 2012-05-09 20:10     ` H. Peter Anvin
  2012-05-09 20:12       ` Michael S. Tsirkin
  2012-05-10 23:02       ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 20+ messages in thread
From: H. Peter Anvin @ 2012-05-09 20:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Andrew Morton, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Arnd Bergmann, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 01:07 PM, Michael S. Tsirkin wrote:
> 
> In practice ATM any of the above will work. We probably don't even need
> to add barrier() calls since what we do afterwards is apic access which
> has an optimization barrier anyway.  But I'm fine with adding them in
> there just in case if that's what people want.
> 

If you have the optimization barrier anyway, then I'd be fine with you
just using __test_and_clear_bit() and add to a comment in
arch/x86/include/asm/bitops.h that KVM needs it to be locally atomic.

	-hpa

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] bitops: add _local bitops
  2012-05-09 20:10     ` H. Peter Anvin
@ 2012-05-09 20:12       ` Michael S. Tsirkin
  2012-05-10  9:26         ` Avi Kivity
  2012-05-10 23:02       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 20+ messages in thread
From: Michael S. Tsirkin @ 2012-05-09 20:12 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andrew Morton, Rob Landley, Thomas Gleixner, Ingo Molnar, x86,
	Arnd Bergmann, David Howells, Akinobu Mita, Alexey Dobriyan,
	Herbert Xu, Stephen Rothwell, linux-doc, linux-kernel,
	linux-arch, Gleb Natapov, Paolo Bonzini, kvm, Avi Kivity,
	Marcelo Tosatti, Linus Torvalds

On Wed, May 09, 2012 at 01:10:04PM -0700, H. Peter Anvin wrote:
> On 05/09/2012 01:07 PM, Michael S. Tsirkin wrote:
> > 
> > In practice ATM any of the above will work. We probably don't even need
> > to add barrier() calls since what we do afterwards is apic access which
> > has an optimization barrier anyway.  But I'm fine with adding them in
> > there just in case if that's what people want.
> > 
> 
> If you have the optimization barrier anyway, then I'd be fine with you
> just using __test_and_clear_bit() and add to a comment in
> arch/x86/include/asm/bitops.h that KVM needs it to be locally atomic.
> 
> 	-hpa

Sounds good. Avi?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] bitops: add _local bitops
  2012-05-09 20:12       ` Michael S. Tsirkin
@ 2012-05-10  9:26         ` Avi Kivity
  0 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2012-05-10  9:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: H. Peter Anvin, Andrew Morton, Rob Landley, Thomas Gleixner,
	Ingo Molnar, x86, Arnd Bergmann, David Howells, Akinobu Mita,
	Alexey Dobriyan, Herbert Xu, Stephen Rothwell, linux-doc,
	linux-kernel, linux-arch, Gleb Natapov, Paolo Bonzini, kvm,
	Marcelo Tosatti, Linus Torvalds

On 05/09/2012 11:12 PM, Michael S. Tsirkin wrote:
> On Wed, May 09, 2012 at 01:10:04PM -0700, H. Peter Anvin wrote:
> > On 05/09/2012 01:07 PM, Michael S. Tsirkin wrote:
> > > 
> > > In practice ATM any of the above will work. We probably don't even need
> > > to add barrier() calls since what we do afterwards is apic access which
> > > has an optimization barrier anyway.  But I'm fine with adding them in
> > > there just in case if that's what people want.
> > > 
> > 
> > If you have the optimization barrier anyway, then I'd be fine with you
> > just using __test_and_clear_bit() and add to a comment in
> > arch/x86/include/asm/bitops.h that KVM needs it to be locally atomic.
> > 
> > 	-hpa
>
> Sounds good. Avi?

Okay.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] bitops: add _local bitops
  2012-05-09 13:45 [PATCH] bitops: add _local bitops Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2012-05-09 19:19 ` Andrew Morton
@ 2012-05-10 17:38 ` Rob Landley
  4 siblings, 0 replies; 20+ messages in thread
From: Rob Landley @ 2012-05-10 17:38 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: linux-doc, linux-kernel, linux-arch, kvm

On 05/09/2012 08:45 AM, Michael S. Tsirkin wrote:
> diff --git a/Documentation/atomic_ops.txt b/Documentation/atomic_ops.txt
> index 27f2b21..b7e3b67 100644
> --- a/Documentation/atomic_ops.txt
> +++ b/Documentation/atomic_ops.txt
> @@ -520,6 +520,25 @@ The __clear_bit_unlock version is non-atomic, however it still implements
>  unlock barrier semantics. This can be useful if the lock itself is protecting
>  the other bits in the word.
>  
> +Local versions of the bitmask operations are also provided.  They are used in
> +contexts where the operations need to be performed atomically with respect to
> +the local CPU, but no other CPU accesses the bitmask.  This assumption makes it
> +possible to avoid the need for SMP protection and use less expensive atomic
> +operations in the implementation.
> +They have names similar to the above bitmask operation interfaces,
> +except that _local is suffixed to the interface name.
> +
> +	void set_bit_local(unsigned long nr, volatile unsigned long *addr);
> +	void clear_bit_local(unsigned long nr, volatile unsigned long *addr);
> +	void change_bit_local(unsigned long nr, volatile unsigned long *addr);
> +	int test_and_set_bit_local(unsigned long nr, volatile unsigned long *addr);
> +	int test_and_clear_bit_local(unsigned long nr, volatile unsigned long *addr);
> +	int test_and_change_bit_local(unsigned long nr, volatile unsigned long *addr);
> +
> +These local variants are useful for example if the bitmask may be accessed from
> +a local interrupt, or from a hypervisor on the same CPU if running in a VM.
> +These local variants also do not have any special memory barrier semantics.
> +
>  Finally, there are non-atomic versions of the bitmask operations
>  provided.  They are used in contexts where some other higher-level SMP
>  locking scheme is being used to protect the bitmask, and thus less

For this bit:

Acked-by: Rob Landley <rob@landley.net>

Rob
-- 
GNU/Linux isn't: Linux=GPLv2, GNU=GPLv3+, they can't share code.
Either it's "mere aggregation", or a license violation.  Pick one.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH] bitops: add _local bitops
  2012-05-09 20:10     ` H. Peter Anvin
  2012-05-09 20:12       ` Michael S. Tsirkin
@ 2012-05-10 23:02       ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 20+ messages in thread
From: Benjamin Herrenschmidt @ 2012-05-10 23:02 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Michael S. Tsirkin, Andrew Morton, Rob Landley, Thomas Gleixner,
	Ingo Molnar, x86, Arnd Bergmann, David Howells, Akinobu Mita,
	Alexey Dobriyan, Herbert Xu, Stephen Rothwell, linux-doc,
	linux-kernel, linux-arch, Gleb Natapov, Paolo Bonzini, kvm,
	Avi Kivity, Marcelo Tosatti, Linus Torvalds

On Wed, 2012-05-09 at 13:10 -0700, H. Peter Anvin wrote:
> On 05/09/2012 01:07 PM, Michael S. Tsirkin wrote:
> > 
> > In practice ATM any of the above will work. We probably don't even need
> > to add barrier() calls since what we do afterwards is apic access which
> > has an optimization barrier anyway.  But I'm fine with adding them in
> > there just in case if that's what people want.
> > 
> 
> If you have the optimization barrier anyway, then I'd be fine with you
> just using __test_and_clear_bit() and add to a comment in
> arch/x86/include/asm/bitops.h that KVM needs it to be locally atomic.
> 
> 	
What is it used for? I.e., is this strictly a requirement of x86 KVM?

Cheers,
Ben.



^ permalink raw reply	[flat|nested] 20+ messages in thread

