linux-kernel.vger.kernel.org archive mirror
* [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses
@ 2014-12-03 22:30 Christian Borntraeger
  2014-12-03 22:30 ` [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE Christian Borntraeger
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

As discussed on LKML (http://marc.info/?i=54611D86.4040306%40de.ibm.com),
ACCESS_ONCE might fail with specific compilers for non-scalar accesses.

Here is a set of patches to tackle that problem.

The first patch introduces READ_ONCE and ASSIGN_ONCE. If the data structure
is larger than the machine word size, memcpy is used and a warning is emitted.
The next patches fix up all in-tree users of ACCESS_ONCE on non-scalar types.
The last patch forces ACCESS_ONCE to work only on scalar types.
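
For illustration, a minimal usage sketch of the two new macros (the
variable names here are hypothetical, not taken from the series):

	pte_t pte;

	pte = READ_ONCE(*ptep);		/* read forced through scalar types */
	ASSIGN_ONCE(pte, *ptep);	/* write pte back to *ptep, same rules */

Accesses up to the machine word size remain single volatile loads and
stores; anything larger falls back to memcpy plus a compile-time warning.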

I have cross-compiled the resulting kernel with defconfig and gcc 4.9 for
microblaze, m68k, alpha, s390, x86_64, i686, sparc, sparc64, mips,
ia64, arm and arm64.

Runtime tested on s390x and x86_64. I have also verified that ASSIGN_ONCE works
as expected with some test changes, as there are no users in this patch series.

Linus, ok for the next merge window?

Christian Borntraeger (9):
  kernel: Provide READ_ONCE and ASSIGN_ONCE
  mm: replace ACCESS_ONCE with READ_ONCE or barriers
  x86/spinlock: Replace ACCESS_ONCE with READ_ONCE
  x86/gup: Replace ACCESS_ONCE with READ_ONCE
  mips/gup: Replace ACCESS_ONCE with READ_ONCE
  arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE
  arm/spinlock: Replace ACCESS_ONCE with READ_ONCE
  s390/kvm: Replace ACCESS_ONCE with READ_ONCE
  kernel: tighten rules for ACCESS_ONCE

 arch/arm/include/asm/spinlock.h   |  4 +--
 arch/arm64/include/asm/spinlock.h |  4 +--
 arch/mips/mm/gup.c                |  2 +-
 arch/s390/kvm/gaccess.c           | 14 ++++----
 arch/x86/include/asm/spinlock.h   |  8 ++---
 arch/x86/mm/gup.c                 |  2 +-
 include/linux/compiler.h          | 74 ++++++++++++++++++++++++++++++++++++++-
 mm/gup.c                          |  2 +-
 mm/memory.c                       |  2 +-
 mm/rmap.c                         |  3 +-
 10 files changed, 95 insertions(+), 20 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:07   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers Christian Borntraeger
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
scalar types as suggested by Linus Torvalds. Accesses larger than
the machine's word size cannot be guaranteed to be atomic. These
macros will use memcpy and emit a build warning.
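
As a rough sketch of the resulting behaviour (the types here are made
up for illustration):

	u32 x;
	struct { u64 a, b; } big;

	READ_ONCE(x);	/* single volatile 32-bit load */
	READ_ONCE(big);	/* 16 bytes: memcpy fallback plus the
			 * "data access exceeds word size" warning */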

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 include/linux/compiler.h | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d5ad7b1..947710e 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -186,6 +186,70 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
 # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
 #endif
 
+#include <linux/types.h>
+
+void data_access_exceeds_word_size(void)
+__compiletime_warning("data access exceeds word size and won't be atomic");
+
+static __always_inline void __read_once_size(volatile void *p, void *res, int size)
+{
+	switch (size) {
+	case 1: *(u8 *)res = *(volatile u8 *)p; break;
+	case 2: *(u16 *)res = *(volatile u16 *)p; break;
+	case 4: *(u32 *)res = *(volatile u32 *)p; break;
+#ifdef CONFIG_64BIT
+	case 8: *(u64 *)res = *(volatile u64 *)p; break;
+#endif
+	default:
+		barrier();
+		__builtin_memcpy((void *)res, (const void *)p, size);
+		data_access_exceeds_word_size();
+		barrier();
+	}
+}
+
+static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
+{
+	switch (size) {
+	case 1: *(volatile u8 *)p = *(u8 *)res; break;
+	case 2: *(volatile u16 *)p = *(u16 *)res; break;
+	case 4: *(volatile u32 *)p = *(u32 *)res; break;
+#ifdef CONFIG_64BIT
+	case 8: *(volatile u64 *)p = *(u64 *)res; break;
+#endif
+	default:
+		barrier();
+		__builtin_memcpy((void *)p, (const void *)res, size);
+		data_access_exceeds_word_size();
+		barrier();
+	}
+}
+
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The compiler
+ * is also forbidden from reordering successive instances of READ_ONCE,
+ * ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the compiler is aware
+ * of some particular ordering.  One way to make the compiler aware of ordering
+ * is to put the two invocations of READ_ONCE, ASSIGN_ONCE or ACCESS_ONCE() in
+ * different C statements.
+ *
+ * In contrast to ACCESS_ONCE these two macros will also work on aggregate data
+ * types like structs or unions. If the size of the accessed data type exceeds
+ * the word size of the machine (e.g. 32bit or 64bit), the access might happen
+ * in multiple chunks, though.
+ *
+ * These macros do absolutely -nothing- to prevent the CPU from reordering,
+ * merging, or refetching absolutely anything at any time.  Their main intended
+ * use is to mediate communication between process-level code and irq/NMI
+ * handlers, all running on the same CPU.
+ */
+
+#define READ_ONCE(p) \
+      ({ typeof(p) __val; __read_once_size(&p, &__val, sizeof(__val)); __val; })
+
+#define ASSIGN_ONCE(val, p) \
+      ({ typeof(p) __val; __val = val; __assign_once_size(&p, &__val, sizeof(__val)); __val; })
+
 #endif /* __KERNEL__ */
 
 #endif /* __ASSEMBLY__ */
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
  2014-12-03 22:30 ` [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:09   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Let's change the code to access the page table elements with
READ_ONCE that does implicit scalar accesses.

mm_find_pmd is tricky, because m68k and sparc (32bit) define pmd_t
as an array of longs. This code only requires that the pmd_present
and pmd_trans_huge checks are done on the same value, so a barrier
is sufficient.
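
Annotated, the pattern used in the mm/rmap.c hunk below:

	pmde = *pmd;	/* may be a multi-word copy on m68k/sparc32 */
	barrier();	/* keep gcc from reloading *pmd afterwards */
	if (!pmd_present(pmde) || pmd_trans_huge(pmde))
		pmd = NULL;

Without the barrier the compiler would be free to discard pmde and
refetch *pmd for each check, so present and !THP could be tested on
different values.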

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 mm/gup.c    | 2 +-
 mm/memory.c | 2 +-
 mm/rmap.c   | 3 ++-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index cd62c8c..f2305de 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -917,7 +917,7 @@ static int gup_pud_range(pgd_t *pgdp, unsigned long addr, unsigned long end,
 
 	pudp = pud_offset(pgdp, addr);
 	do {
-		pud_t pud = ACCESS_ONCE(*pudp);
+		pud_t pud = READ_ONCE(*pudp);
 
 		next = pud_addr_end(addr, end);
 		if (pud_none(pud))
diff --git a/mm/memory.c b/mm/memory.c
index 3e50383..9e0c84e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3202,7 +3202,7 @@ static int handle_pte_fault(struct mm_struct *mm,
 	pte_t entry;
 	spinlock_t *ptl;
 
-	entry = ACCESS_ONCE(*pte);
+	entry = READ_ONCE(*pte);
 	if (!pte_present(entry)) {
 		if (pte_none(entry)) {
 			if (vma->vm_ops) {
diff --git a/mm/rmap.c b/mm/rmap.c
index 19886fb..1e54274 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -581,7 +581,8 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 	 * without holding anon_vma lock for write.  So when looking for a
 	 * genuine pmde (in which to find pte), test present and !THP together.
 	 */
-	pmde = ACCESS_ONCE(*pmd);
+	pmde = *pmd;
+	barrier();
 	if (!pmd_present(pmde) || pmd_trans_huge(pmde))
 		pmd = NULL;
 out:
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
  2014-12-03 22:30 ` [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE Christian Borntraeger
  2014-12-03 22:30 ` [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:10   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 4/9] x86/gup: " Christian Borntraeger
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.
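
The lock->tickets field is a two-member struct, which is exactly the
problematic aggregate case; schematically (see asm/spinlock_types.h for
the real layout):

	struct __raw_tickets { __ticket_t head, tail; };

	old.tickets = ACCESS_ONCE(lock->tickets);	/* aggregate copy; SRA in
							 * gcc 4.6/4.7 may drop
							 * the volatile */
	old.tickets = READ_ONCE(lock->tickets);		/* routed through a
							 * scalar-sized access */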

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/x86/include/asm/spinlock.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 9295016..12a69b4 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -92,7 +92,7 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 		unsigned count = SPIN_THRESHOLD;
 
 		do {
-			if (ACCESS_ONCE(lock->tickets.head) == inc.tail)
+			if (READ_ONCE(lock->tickets.head) == inc.tail)
 				goto out;
 			cpu_relax();
 		} while (--count);
@@ -105,7 +105,7 @@ static __always_inline int arch_spin_trylock(arch_spinlock_t *lock)
 {
 	arch_spinlock_t old, new;
 
-	old.tickets = ACCESS_ONCE(lock->tickets);
+	old.tickets = READ_ONCE(lock->tickets);
 	if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG))
 		return 0;
 
@@ -162,14 +162,14 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	struct __raw_tickets tmp = ACCESS_ONCE(lock->tickets);
+	struct __raw_tickets tmp = READ_ONCE(lock->tickets);
 
 	return tmp.tail != tmp.head;
 }
 
 static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	struct __raw_tickets tmp = ACCESS_ONCE(lock->tickets);
+	struct __raw_tickets tmp = READ_ONCE(lock->tickets);
 
 	return (__ticket_t)(tmp.tail - tmp.head) > TICKET_LOCK_INC;
 }
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/9] x86/gup: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (2 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:10   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 5/9] mips/gup: " Christian Borntraeger
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the gup code to replace ACCESS_ONCE with READ_ONCE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/x86/mm/gup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
index 207d9aef..d754782 100644
--- a/arch/x86/mm/gup.c
+++ b/arch/x86/mm/gup.c
@@ -15,7 +15,7 @@
 static inline pte_t gup_get_pte(pte_t *ptep)
 {
 #ifndef CONFIG_X86_PAE
-	return ACCESS_ONCE(*ptep);
+	return READ_ONCE(*ptep);
 #else
 	/*
 	 * With get_user_pages_fast, we walk down the pagetables without taking
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 5/9] mips/gup: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (3 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 4/9] x86/gup: " Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:11   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the gup code to replace ACCESS_ONCE with READ_ONCE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/mips/mm/gup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c
index 06ce17c..8aa50e3 100644
--- a/arch/mips/mm/gup.c
+++ b/arch/mips/mm/gup.c
@@ -30,7 +30,7 @@ retry:
 
 	return pte;
 #else
-	return ACCESS_ONCE(*ptep);
+	return READ_ONCE(*ptep);
 #endif
 }
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (4 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 5/9] mips/gup: " Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:11   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/arm64/include/asm/spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index c45b7b1..cee1287 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -99,12 +99,12 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	return !arch_spin_value_unlocked(ACCESS_ONCE(*lock));
+	return !arch_spin_value_unlocked(READ_ONCE(*lock));
 }
 
 static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	arch_spinlock_t lockval = ACCESS_ONCE(*lock);
+	arch_spinlock_t lockval = READ_ONCE(*lock);
 	return (lockval.next - lockval.owner) > 1;
 }
 #define arch_spin_is_contended	arch_spin_is_contended
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (5 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:12   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 8/9] s390/kvm: Replace " Christian Borntraeger
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/arm/include/asm/spinlock.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
index ac4bfae..0fa4184 100644
--- a/arch/arm/include/asm/spinlock.h
+++ b/arch/arm/include/asm/spinlock.h
@@ -120,12 +120,12 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	return !arch_spin_value_unlocked(ACCESS_ONCE(*lock));
+	return !arch_spin_value_unlocked(READ_ONCE(*lock));
 }
 
 static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	struct __raw_tickets tickets = ACCESS_ONCE(lock->tickets);
+	struct __raw_tickets tickets = READ_ONCE(lock->tickets);
 	return (tickets.next - tickets.owner) > 1;
 }
 #define arch_spin_is_contended	arch_spin_is_contended
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 8/9] s390/kvm: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (6 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:12   ` Paul E. McKenney
  2014-12-03 22:30 ` [PATCH 9/9] kernel: tighten rules for ACCESS_ONCE Christian Borntraeger
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)

Change the ipte lock code to replace ACCESS_ONCE with READ_ONCE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 arch/s390/kvm/gaccess.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index 0f961a1..8f195fa 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -229,10 +229,10 @@ static void ipte_lock_simple(struct kvm_vcpu *vcpu)
 		goto out;
 	ic = &vcpu->kvm->arch.sca->ipte_control;
 	do {
-		old = ACCESS_ONCE(*ic);
+		old = READ_ONCE(*ic);
 		while (old.k) {
 			cond_resched();
-			old = ACCESS_ONCE(*ic);
+			old = READ_ONCE(*ic);
 		}
 		new = old;
 		new.k = 1;
@@ -251,7 +251,8 @@ static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
 		goto out;
 	ic = &vcpu->kvm->arch.sca->ipte_control;
 	do {
-		new = old = ACCESS_ONCE(*ic);
+		old = READ_ONCE(*ic);
+		new = old;
 		new.k = 0;
 	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
 	wake_up(&vcpu->kvm->arch.ipte_wq);
@@ -265,10 +266,10 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
 
 	ic = &vcpu->kvm->arch.sca->ipte_control;
 	do {
-		old = ACCESS_ONCE(*ic);
+		old = READ_ONCE(*ic);
 		while (old.kg) {
 			cond_resched();
-			old = ACCESS_ONCE(*ic);
+			old = READ_ONCE(*ic);
 		}
 		new = old;
 		new.k = 1;
@@ -282,7 +283,8 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
 
 	ic = &vcpu->kvm->arch.sca->ipte_control;
 	do {
-		new = old = ACCESS_ONCE(*ic);
+		old = READ_ONCE(*ic);
+		new = old;
 		new.kh--;
 		if (!new.kh)
 			new.k = 0;
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 9/9] kernel: tighten rules for ACCESS_ONCE
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (7 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 8/9] s390/kvm: Replace " Christian Borntraeger
@ 2014-12-03 22:30 ` Christian Borntraeger
  2014-12-04  0:16   ` Paul E. McKenney
  2014-12-04 15:24 ` [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
  2014-12-04 23:40 ` Linus Torvalds
  10 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-03 22:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds, Christian Borntraeger

Now that all non-scalar users of ACCESS_ONCE have been converted
to READ_ONCE or ASSIGN_ONCE, let's tighten ACCESS_ONCE to only
work on scalar types.
This variant was proposed by Alexei Starovoitov.
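
With the tightened definition, the dummy "__var = 0" initialization
only compiles for scalar types; roughly (made-up examples):

	int i;
	struct s { int a; } t;
	union u { int a; short b; } v;

	ACCESS_ONCE(i);		/* fine: an int can be initialized with 0 */
	ACCESS_ONCE(v.a);	/* fine: scalar member covering the union */
	ACCESS_ONCE(t);		/* now a build error: a struct cannot be
				 * initialized with a plain 0 */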

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 include/linux/compiler.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 947710e..5ba91de 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -441,8 +441,16 @@ static __always_inline void __assign_once_size(volatile void *p, void *res, int
  * merging, or refetching absolutely anything at any time.  Its main intended
  * use is to mediate communication between process-level code and irq/NMI
  * handlers, all running on the same CPU.
+ *
+ * ACCESS_ONCE will only work on scalar types. For union types, ACCESS_ONCE
+ * on a union member will work as long as the size of the member matches the
+ * size of the union and the size is smaller than word size.
+ * If possible READ_ONCE/ASSIGN_ONCE should be used instead.
  */
-#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
+#define __ACCESS_ONCE(x) ({ \
+	 __maybe_unused typeof(x) __var = 0; \
+	(volatile typeof(x) *)&(x); })
+#define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
 
 /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */
 #ifdef CONFIG_KPROBES
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE
  2014-12-03 22:30 ` [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE Christian Borntraeger
@ 2014-12-04  0:07   ` Paul E. McKenney
  2014-12-04  9:24     ` Christian Borntraeger
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:07 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:13PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
> scalar types as suggested by Linus Torvalds. Accesses larger than
> the machine's word size cannot be guaranteed to be atomic. These
> macros will use memcpy and emit a build warning.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

With or without some nits below addressed:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  include/linux/compiler.h | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index d5ad7b1..947710e 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -186,6 +186,70 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
>  # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
>  #endif
> 
> +#include <linux/types.h>
> +
> +void data_access_exceeds_word_size(void)
> +__compiletime_warning("data access exceeds word size and won't be atomic");
> +
> +static __always_inline void __read_once_size(volatile void *p, void *res, int size)
> +{
> +	switch (size) {
> +	case 1: *(u8 *)res = *(volatile u8 *)p; break;
> +	case 2: *(u16 *)res = *(volatile u16 *)p; break;
> +	case 4: *(u32 *)res = *(volatile u32 *)p; break;
> +#ifdef CONFIG_64BIT
> +	case 8: *(u64 *)res = *(volatile u64 *)p; break;
> +#endif

You could get rid of the #ifdef above by doing "case sizeof(long)"
and switching the casts from u64 to unsigned long.

> +	default:
> +		barrier();
> +		__builtin_memcpy((void *)res, (const void *)p, size);
> +		data_access_exceeds_word_size();
> +		barrier();
> +	}
> +}
> +
> +static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
> +{
> +	switch (size) {
> +	case 1: *(volatile u8 *)p = *(u8 *)res; break;
> +	case 2: *(volatile u16 *)p = *(u16 *)res; break;
> +	case 4: *(volatile u32 *)p = *(u32 *)res; break;
> +#ifdef CONFIG_64BIT
> +	case 8: *(volatile u64 *)p = *(u64 *)res; break;
> +#endif

Ditto.

> +	default:
> +		barrier();
> +		__builtin_memcpy((void *)p, (const void *)res, size);
> +		data_access_exceeds_word_size();
> +		barrier();
> +	}
> +}
> +
> +/*
> + * Prevent the compiler from merging or refetching reads or writes. The compiler
> + * is also forbidden from reordering successive instances of READ_ONCE,
> + * ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the compiler is aware
> + * of some particular ordering.  One way to make the compiler aware of ordering
> + * is to put the two invocations of READ_ONCE, ASSIGN_ONCE or ACCESS_ONCE() in
> + * different C statements.
> + *
> + * In contrast to ACCESS_ONCE these two macros will also work on aggregate data
> + * types like structs or unions. If the size of the accessed data type exceeds
> + * the word size of the machine (e.g. 32bit or 64bit), the access might happen
> + * in multiple chunks, though.

This last sentence might be more clear if it was something like the
following:

	If the size of the accessed data type exceeds the word size of
	the machine (e.g., 32 bits or 64 bits), ACCESS_ONCE() might
	carry out the access in multiple chunks, but READ_ONCE() and
	ASSIGN_ONCE() will give a link-time error.

> + *
> + * These macros do absolutely -nothing- to prevent the CPU from reordering,
> + * merging, or refetching absolutely anything at any time.  Their main intended
> + * use is to mediate communication between process-level code and irq/NMI
> + * handlers, all running on the same CPU.

This last sentence is now obsolete.  How about something like this?

	Their two major use cases are: (1) Mediating communication
	between process-level code and irq/NMI handlers, all running
	on the same CPU, and (2) Ensuring that the compiler does not
	fold, spindle, or otherwise mutilate accesses that either do
	not require ordering or that interact with an explicit memory
	barrier or atomic instruction that provides the required ordering.

> + */
> +
> +#define READ_ONCE(p) \
> +      ({ typeof(p) __val; __read_once_size(&p, &__val, sizeof(__val)); __val; })
> +
> +#define ASSIGN_ONCE(val, p) \
> +      ({ typeof(p) __val; __val = val; __assign_once_size(&p, &__val, sizeof(__val)); __val; })
> +
>  #endif /* __KERNEL__ */
> 
>  #endif /* __ASSEMBLY__ */
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers
  2014-12-03 22:30 ` [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers Christian Borntraeger
@ 2014-12-04  0:09   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:09 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:14PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Let's change the code to access the page table elements with
> READ_ONCE that does implicit scalar accesses.
> 
> mm_find_pmd is tricky, because m68k and sparc (32bit) define pmd_t
> as an array of longs. This code only requires that the pmd_present
> and pmd_trans_huge checks are done on the same value, so a barrier
> is sufficient.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  mm/gup.c    | 2 +-
>  mm/memory.c | 2 +-
>  mm/rmap.c   | 3 ++-
>  3 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index cd62c8c..f2305de 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -917,7 +917,7 @@ static int gup_pud_range(pgd_t *pgdp, unsigned long addr, unsigned long end,
> 
>  	pudp = pud_offset(pgdp, addr);
>  	do {
> -		pud_t pud = ACCESS_ONCE(*pudp);
> +		pud_t pud = READ_ONCE(*pudp);
> 
>  		next = pud_addr_end(addr, end);
>  		if (pud_none(pud))
> diff --git a/mm/memory.c b/mm/memory.c
> index 3e50383..9e0c84e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3202,7 +3202,7 @@ static int handle_pte_fault(struct mm_struct *mm,
>  	pte_t entry;
>  	spinlock_t *ptl;
> 
> -	entry = ACCESS_ONCE(*pte);
> +	entry = READ_ONCE(*pte);
>  	if (!pte_present(entry)) {
>  		if (pte_none(entry)) {
>  			if (vma->vm_ops) {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 19886fb..1e54274 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -581,7 +581,8 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
>  	 * without holding anon_vma lock for write.  So when looking for a
>  	 * genuine pmde (in which to find pte), test present and !THP together.
>  	 */
> -	pmde = ACCESS_ONCE(*pmd);
> +	pmde = *pmd;
> +	barrier();
>  	if (!pmd_present(pmde) || pmd_trans_huge(pmde))
>  		pmd = NULL;
>  out:
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-04  0:10   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:10 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:15PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/x86/include/asm/spinlock.h | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
> index 9295016..12a69b4 100644
> --- a/arch/x86/include/asm/spinlock.h
> +++ b/arch/x86/include/asm/spinlock.h
> @@ -92,7 +92,7 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
>  		unsigned count = SPIN_THRESHOLD;
> 
>  		do {
> -			if (ACCESS_ONCE(lock->tickets.head) == inc.tail)
> +			if (READ_ONCE(lock->tickets.head) == inc.tail)
>  				goto out;
>  			cpu_relax();
>  		} while (--count);
> @@ -105,7 +105,7 @@ static __always_inline int arch_spin_trylock(arch_spinlock_t *lock)
>  {
>  	arch_spinlock_t old, new;
> 
> -	old.tickets = ACCESS_ONCE(lock->tickets);
> +	old.tickets = READ_ONCE(lock->tickets);
>  	if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG))
>  		return 0;
> 
> @@ -162,14 +162,14 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
> 
>  static inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> -	struct __raw_tickets tmp = ACCESS_ONCE(lock->tickets);
> +	struct __raw_tickets tmp = READ_ONCE(lock->tickets);
> 
>  	return tmp.tail != tmp.head;
>  }
> 
>  static inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  {
> -	struct __raw_tickets tmp = ACCESS_ONCE(lock->tickets);
> +	struct __raw_tickets tmp = READ_ONCE(lock->tickets);
> 
>  	return (__ticket_t)(tmp.tail - tmp.head) > TICKET_LOCK_INC;
>  }
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/9] x86/gup: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 4/9] x86/gup: " Christian Borntraeger
@ 2014-12-04  0:10   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:10 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:16PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the gup code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/x86/mm/gup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
> index 207d9aef..d754782 100644
> --- a/arch/x86/mm/gup.c
> +++ b/arch/x86/mm/gup.c
> @@ -15,7 +15,7 @@
>  static inline pte_t gup_get_pte(pte_t *ptep)
>  {
>  #ifndef CONFIG_X86_PAE
> -	return ACCESS_ONCE(*ptep);
> +	return READ_ONCE(*ptep);
>  #else
>  	/*
>  	 * With get_user_pages_fast, we walk down the pagetables without taking
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 5/9] mips/gup: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 5/9] mips/gup: " Christian Borntraeger
@ 2014-12-04  0:11   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:11 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:17PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the gup code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/mips/mm/gup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c
> index 06ce17c..8aa50e3 100644
> --- a/arch/mips/mm/gup.c
> +++ b/arch/mips/mm/gup.c
> @@ -30,7 +30,7 @@ retry:
> 
>  	return pte;
>  #else
> -	return ACCESS_ONCE(*ptep);
> +	return READ_ONCE(*ptep);
>  #endif
>  }
> 
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-04  0:11   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:11 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:18PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/arm64/include/asm/spinlock.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
> index c45b7b1..cee1287 100644
> --- a/arch/arm64/include/asm/spinlock.h
> +++ b/arch/arm64/include/asm/spinlock.h
> @@ -99,12 +99,12 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> 
>  static inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> -	return !arch_spin_value_unlocked(ACCESS_ONCE(*lock));
> +	return !arch_spin_value_unlocked(READ_ONCE(*lock));
>  }
> 
>  static inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  {
> -	arch_spinlock_t lockval = ACCESS_ONCE(*lock);
> +	arch_spinlock_t lockval = READ_ONCE(*lock);
>  	return (lockval.next - lockval.owner) > 1;
>  }
>  #define arch_spin_is_contended	arch_spin_is_contended
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
@ 2014-12-04  0:12   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:12 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:19PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the spinlock code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/arm/include/asm/spinlock.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
> index ac4bfae..0fa4184 100644
> --- a/arch/arm/include/asm/spinlock.h
> +++ b/arch/arm/include/asm/spinlock.h
> @@ -120,12 +120,12 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> 
>  static inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> -	return !arch_spin_value_unlocked(ACCESS_ONCE(*lock));
> +	return !arch_spin_value_unlocked(READ_ONCE(*lock));
>  }
> 
>  static inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  {
> -	struct __raw_tickets tickets = ACCESS_ONCE(lock->tickets);
> +	struct __raw_tickets tickets = READ_ONCE(lock->tickets);
>  	return (tickets.next - tickets.owner) > 1;
>  }
>  #define arch_spin_is_contended	arch_spin_is_contended
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 8/9] s390/kvm: Replace ACCESS_ONCE with READ_ONCE
  2014-12-03 22:30 ` [PATCH 8/9] s390/kvm: Replace " Christian Borntraeger
@ 2014-12-04  0:12   ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:12 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:20PM +0100, Christian Borntraeger wrote:
> ACCESS_ONCE does not work reliably on non-scalar types. For
> example gcc 4.6 and 4.7 might remove the volatile tag for such
> accesses during the SRA (scalar replacement of aggregates) step
> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> 
> Change the ipte lock code to replace ACCESS_ONCE with READ_ONCE.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  arch/s390/kvm/gaccess.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
> index 0f961a1..8f195fa 100644
> --- a/arch/s390/kvm/gaccess.c
> +++ b/arch/s390/kvm/gaccess.c
> @@ -229,10 +229,10 @@ static void ipte_lock_simple(struct kvm_vcpu *vcpu)
>  		goto out;
>  	ic = &vcpu->kvm->arch.sca->ipte_control;
>  	do {
> -		old = ACCESS_ONCE(*ic);
> +		old = READ_ONCE(*ic);
>  		while (old.k) {
>  			cond_resched();
> -			old = ACCESS_ONCE(*ic);
> +			old = READ_ONCE(*ic);
>  		}
>  		new = old;
>  		new.k = 1;
> @@ -251,7 +251,8 @@ static void ipte_unlock_simple(struct kvm_vcpu *vcpu)
>  		goto out;
>  	ic = &vcpu->kvm->arch.sca->ipte_control;
>  	do {
> -		new = old = ACCESS_ONCE(*ic);
> +		old = READ_ONCE(*ic);
> +		new = old;
>  		new.k = 0;
>  	} while (cmpxchg(&ic->val, old.val, new.val) != old.val);
>  	wake_up(&vcpu->kvm->arch.ipte_wq);
> @@ -265,10 +266,10 @@ static void ipte_lock_siif(struct kvm_vcpu *vcpu)
> 
>  	ic = &vcpu->kvm->arch.sca->ipte_control;
>  	do {
> -		old = ACCESS_ONCE(*ic);
> +		old = READ_ONCE(*ic);
>  		while (old.kg) {
>  			cond_resched();
> -			old = ACCESS_ONCE(*ic);
> +			old = READ_ONCE(*ic);
>  		}
>  		new = old;
>  		new.k = 1;
> @@ -282,7 +283,8 @@ static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
> 
>  	ic = &vcpu->kvm->arch.sca->ipte_control;
>  	do {
> -		new = old = ACCESS_ONCE(*ic);
> +		old = READ_ONCE(*ic);
> +		new = old;
>  		new.kh--;
>  		if (!new.kh)
>  			new.k = 0;
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] kernel: tighten rules for ACCESS_ONCE
  2014-12-03 22:30 ` [PATCH 9/9] kernel: tighten rules for ACCESS_ONCE Christian Borntraeger
@ 2014-12-04  0:16   ` Paul E. McKenney
  2014-12-04  9:28     ` Christian Borntraeger
  0 siblings, 1 reply; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04  0:16 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Wed, Dec 03, 2014 at 11:30:21PM +0100, Christian Borntraeger wrote:
> Now that all non-scalar users of ACCESS_ONCE have been converted
> to READ_ONCE or ASSIGN_ONCE, let's tighten ACCESS_ONCE to only
> work on scalar types.
> This variant was proposed by Alexei Starovoitov.
> 
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>

With or without the updated comment:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  include/linux/compiler.h | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 947710e..5ba91de 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -441,8 +441,16 @@ static __always_inline void __assign_once_size(volatile void *p, void *res, int
>   * merging, or refetching absolutely anything at any time.  Its main intended
>   * use is to mediate communication between process-level code and irq/NMI
>   * handlers, all running on the same CPU.

This comment is obsolete in the same way as that of READ_ONCE() and
ASSIGN_ONCE(), but probably more to the point to just get rid of
ACCESS_ONCE().  ;-)

> + *
> + * ACCESS_ONCE will only work on scalar types. For union types, ACCESS_ONCE
> + * on a union member will work as long as the size of the member matches the
> + * size of the union and the size is smaller than word size.
> + * If possible READ_ONCE/ASSIGN_ONCE should be used instead.
>   */
> -#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))
> +#define __ACCESS_ONCE(x) ({ \
> +	 __maybe_unused typeof(x) __var = 0; \
> +	(volatile typeof(x) *)&(x); })
> +#define ACCESS_ONCE(x) (*__ACCESS_ONCE(x))
> 
>  /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */
>  #ifdef CONFIG_KPROBES
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE
  2014-12-04  0:07   ` Paul E. McKenney
@ 2014-12-04  9:24     ` Christian Borntraeger
  2014-12-04 14:41       ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-04  9:24 UTC (permalink / raw)
  To: paulmck; +Cc: linux-kernel, linux-arch, torvalds

On 04.12.2014 at 01:07, Paul E. McKenney wrote:
> On Wed, Dec 03, 2014 at 11:30:13PM +0100, Christian Borntraeger wrote:
>> ACCESS_ONCE does not work reliably on non-scalar types. For
>> example gcc 4.6 and 4.7 might remove the volatile tag for such
>> accesses during the SRA (scalar replacement of aggregates) step
>> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
>>
>> Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
>> scalar types as suggested by Linus Torvalds. Accesses larger than
>> the machine's word size cannot be guaranteed to be atomic. These
>> macros will use memcpy and emit a build warning.
>>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> With or without some nits below addressed:
> 
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> 
>> ---
>>  include/linux/compiler.h | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 64 insertions(+)
>>
>> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
>> index d5ad7b1..947710e 100644
>> --- a/include/linux/compiler.h
>> +++ b/include/linux/compiler.h
>> @@ -186,6 +186,70 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
>>  # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
>>  #endif
>>
>> +#include <linux/types.h>
>> +
>> +void data_access_exceeds_word_size(void)
>> +__compiletime_warning("data access exceeds word size and won't be atomic");
>> +
>> +static __always_inline void __read_once_size(volatile void *p, void *res, int size)
>> +{
>> +	switch (size) {
>> +	case 1: *(u8 *)res = *(volatile u8 *)p; break;
>> +	case 2: *(u16 *)res = *(volatile u16 *)p; break;
>> +	case 4: *(u32 *)res = *(volatile u32 *)p; break;
>> +#ifdef CONFIG_64BIT
>> +	case 8: *(u64 *)res = *(volatile u64 *)p; break;
>> +#endif
> 
> You could get rid of the #ifdef above by doing "case sizeof(long)"
> and switching the casts from u64 to unsigned long.

Wouldn't that cause a compile warning because we have two case statements
with the same size?

> 
>> +	default:
>> +		barrier();
>> +		__builtin_memcpy((void *)res, (const void *)p, size);
>> +		data_access_exceeds_word_size();
>> +		barrier();
>> +	}
>> +}
>> +
>> +static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
>> +{
>> +	switch (size) {
>> +	case 1: *(volatile u8 *)p = *(u8 *)res; break;
>> +	case 2: *(volatile u16 *)p = *(u16 *)res; break;
>> +	case 4: *(volatile u32 *)p = *(u32 *)res; break;
>> +#ifdef CONFIG_64BIT
>> +	case 8: *(volatile u64 *)p = *(u64 *)res; break;
>> +#endif
> 
> Ditto.
> 
>> +	default:
>> +		barrier();
>> +		__builtin_memcpy((void *)p, (const void *)res, size);
>> +		data_access_exceeds_word_size();
>> +		barrier();
>> +	}
>> +}
>> +
>> +/*
>> + * Prevent the compiler from merging or refetching reads or writes. The compiler
>> + * is also forbidden from reordering successive instances of READ_ONCE,
>> + * ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the compiler is aware
>> + * of some particular ordering.  One way to make the compiler aware of ordering
>> + * is to put the two invocations of READ_ONCE, ASSIGN_ONCE or ACCESS_ONCE() in
>> + * different C statements.
>> + *
>> + * In contrast to ACCESS_ONCE these two macros will also work on aggregate data
>> + * types like structs or unions. If the size of the accessed data type exceeds
>> + * the word size of the machine (e.g. 32bit or 64bit), the access might happen
>> + * in multiple chunks, though.
> 
> This last sentence might be more clear if it was something like the
> following:
> 
> 	If the size of the accessed data type exceeeds the word size of
> 	the machine (e.g., 32 bits or 64 bits), ACCESS_ONCE() might
> 	carry out the access in multiple chunks, but READ_ONCE() and
> 	ASSIGN_ONCE() will give a link-time error.

The code in v4 now combines Linus' (memcpy) and your (error) suggestions. We do a memcpy and a compile-time _warning_.

So I will do 

 * In contrast to ACCESS_ONCE these two macros will also work on aggregate
 * data types like structs or unions. If the size of the accessed data
 * type exceeds the word size of the machine (e.g., 32 bits or 64 bits),
 * READ_ONCE() and ASSIGN_ONCE() will fall back to memcpy and print a
 * compile-time warning.



> 
>> + *
>> + * These macros do absolutely -nothing- to prevent the CPU from reordering,
>> + * merging, or refetching absolutely anything at any time.  Their main intended
>> + * use is to mediate communication between process-level code and irq/NMI
>> + * handlers, all running on the same CPU.
> 
> This last sentence is now obsolete.  How about something like this?
> 
> 	Their two major use cases are: (1) Mediating communication
> 	between process-level code and irq/NMI handlers, all running
> 	on the same CPU, and (2) Ensuring that the compiler does not
> 	fold, spindle, or otherwise mutilate accesses that either do
> 	not require ordering or that interact with an explicit memory
> 	barrier or atomic instruction that provides the required ordering.

sounds good. Will change.

Thanks

> 
>> + */
>> +
>> +#define READ_ONCE(p) \
>> +      ({ typeof(p) __val; __read_once_size(&p, &__val, sizeof(__val)); __val; })
>> +
>> +#define ASSIGN_ONCE(val, p) \
>> +      ({ typeof(p) __val; __val = val; __assign_once_size(&p, &__val, sizeof(__val)); __val; })
>> +
>>  #endif /* __KERNEL__ */
>>
>>  #endif /* __ASSEMBLY__ */
>> -- 
>> 1.9.3
>>


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] kernel: tighten rules for ACCESS_ONCE
  2014-12-04  0:16   ` Paul E. McKenney
@ 2014-12-04  9:28     ` Christian Borntraeger
  2014-12-04 14:41       ` Paul E. McKenney
  0 siblings, 1 reply; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-04  9:28 UTC (permalink / raw)
  To: paulmck; +Cc: linux-kernel, linux-arch, torvalds

On 04.12.2014 at 01:16, Paul E. McKenney wrote:
>>   * merging, or refetching absolutely anything at any time.  Its main intended
>> >   * use is to mediate communication between process-level code and irq/NMI
>> >   * handlers, all running on the same CPU.
> This comment is obsolete in the same way as that of READ_ONCE() and
> ASSIGN_ONCE(), but probably more to the point to just get rid of
> ACCESS_ONCE().  ;-)
> 
>> > 

It's now 

/*
 * Prevent the compiler from merging or refetching accesses.  The compiler
 * is also forbidden from reordering successive instances of ACCESS_ONCE(),
 * but only when the compiler is aware of some particular ordering.  One way
 * to make the compiler aware of ordering is to put the two invocations of
 * ACCESS_ONCE() in different C statements.
 *
 * ACCESS_ONCE will only work on scalar types. For union types, ACCESS_ONCE
 * on a union member will work as long as the size of the member matches the
 * size of the union and the size is smaller than word size.
 *
 * The major use cases of ACCESS_ONCE used to be (1) Mediating communication
 * between process-level code and irq/NMI handlers, all running on the same CPU,
 * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
 * mutilate accesses that either do not require ordering or that interact
 * with an explicit memory barrier or atomic instruction that provides the
 * required ordering.
 *
 * If possible use READ_ONCE/ASSIGN_ONCE instead.
 */


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE
  2014-12-04  9:24     ` Christian Borntraeger
@ 2014-12-04 14:41       ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04 14:41 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Thu, Dec 04, 2014 at 10:24:47AM +0100, Christian Borntraeger wrote:
> On 04.12.2014 at 01:07, Paul E. McKenney wrote:
> > On Wed, Dec 03, 2014 at 11:30:13PM +0100, Christian Borntraeger wrote:
> >> ACCESS_ONCE does not work reliably on non-scalar types. For
> >> example gcc 4.6 and 4.7 might remove the volatile tag for such
> >> accesses during the SRA (scalar replacement of aggregates) step
> >> (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
> >>
> >> Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
> >> scalar types as suggested by Linus Torvalds. Accesses larger than
> >> the machine's word size cannot be guaranteed to be atomic. These
> >> macros will use memcpy and emit a build warning.
> >>
> >> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> > 
> > With or without some nits below addressed:
> > 
> > Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > 
> >> ---
> >>  include/linux/compiler.h | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 64 insertions(+)
> >>
> >> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> >> index d5ad7b1..947710e 100644
> >> --- a/include/linux/compiler.h
> >> +++ b/include/linux/compiler.h
> >> @@ -186,6 +186,70 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
> >>  # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
> >>  #endif
> >>
> >> +#include <linux/types.h>
> >> +
> >> +void data_access_exceeds_word_size(void)
> >> +__compiletime_warning("data access exceeds word size and won't be atomic");
> >> +
> >> +static __always_inline void __read_once_size(volatile void *p, void *res, int size)
> >> +{
> >> +	switch (size) {
> >> +	case 1: *(u8 *)res = *(volatile u8 *)p; break;
> >> +	case 2: *(u16 *)res = *(volatile u16 *)p; break;
> >> +	case 4: *(u32 *)res = *(volatile u32 *)p; break;
> >> +#ifdef CONFIG_64BIT
> >> +	case 8: *(u64 *)res = *(volatile u64 *)p; break;
> >> +#endif
> > 
> > You could get rid of the #ifdef above by doing "case sizeof(long)"
> > and switching the casts from u64 to unsigned long.
> 
> Wouldn't that cause a compile warning because we have two case statements
> with the same size?

Indeed it would!  Color me blind and oblivious.
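
Sketched for a 32-bit build, where sizeof(long) == 4:

	switch (size) {
	case 4: *(u32 *)res = *(volatile u32 *)p; break;
	case sizeof(long):	/* also 4: duplicate case value,
				 * so the build fails */
		break;
	}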

> >> +	default:
> >> +		barrier();
> >> +		__builtin_memcpy((void *)res, (const void *)p, size);
> >> +		data_access_exceeds_word_size();
> >> +		barrier();
> >> +	}
> >> +}
> >> +
> >> +static __always_inline void __assign_once_size(volatile void *p, void *res, int size)
> >> +{
> >> +	switch (size) {
> >> +	case 1: *(volatile u8 *)p = *(u8 *)res; break;
> >> +	case 2: *(volatile u16 *)p = *(u16 *)res; break;
> >> +	case 4: *(volatile u32 *)p = *(u32 *)res; break;
> >> +#ifdef CONFIG_64BIT
> >> +	case 8: *(volatile u64 *)p = *(u64 *)res; break;
> >> +#endif
> > 
> > Ditto.
> > 
> >> +	default:
> >> +		barrier();
> >> +		__builtin_memcpy((void *)p, (const void *)res, size);
> >> +		data_access_exceeds_word_size();
> >> +		barrier();
> >> +	}
> >> +}
> >> +
> >> +/*
> >> + * Prevent the compiler from merging or refetching reads or writes. The compiler
> >> + * is also forbidden from reordering successive instances of READ_ONCE,
> >> + * ASSIGN_ONCE and ACCESS_ONCE (see below), but only when the compiler is aware
> >> + * of some particular ordering.  One way to make the compiler aware of ordering
> >> + * is to put the two invocations of READ_ONCE, ASSIGN_ONCE or ACCESS_ONCE() in
> >> + * different C statements.
> >> + *
> >> + * In contrast to ACCESS_ONCE these two macros will also work on aggregate data
> >> + * types like structs or unions. If the size of the accessed data type exceeds
> >> + * the word size of the machine (e.g. 32bit or 64bit), the access might happen
> >> + * in multiple chunks, though.
> > 
> > This last sentence might be more clear if it was something like the
> > following:
> > 
> > 	If the size of the accessed data type exceeds the word size of
> > 	the machine (e.g., 32 bits or 64 bits), ACCESS_ONCE() might
> > 	carry out the access in multiple chunks, but READ_ONCE() and
> > 	ASSIGN_ONCE() will give a link-time error.
> 
> The code in v4 now combines Linus' (memcpy) and your (error)
> suggestions: we do a memcpy and emit a compile-time _warning_.
> 
> So I will do 
> 
>  * In contrast to ACCESS_ONCE, these two macros will also work on aggregate
>  * data types like structs or unions. If the size of the accessed data
>  * type exceeds the word size of the machine (e.g., 32 bits or 64 bits),
>  * READ_ONCE() and ASSIGN_ONCE() will fall back to memcpy and print a
>  * compile-time warning.

Very good!
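
To make the fallback concrete: the gup conversions later in this series
take a one-shot snapshot even where pte_t is a structure type, while
anything wider than the machine word still compiles but goes through
memcpy and trips the build warning.  A hedged sketch (the demo type and
demo_ptr are hypothetical, assuming the macros from this patch):

	/* fits in a machine word on 64-bit: plain volatile load */
	pte_t pte = READ_ONCE(*ptep);

	/* wider than a word: memcpy fallback, plus the
	 * "data access exceeds word size and won't be atomic"
	 * warning at build time
	 */
	struct demo { unsigned long w[4]; } snap;

	snap = READ_ONCE(*demo_ptr);	/* demo_ptr is assumed */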

							Thanx, Paul

> >> + *
> >> + * These macros do absolutely -nothing- to prevent the CPU from reordering,
> >> + * merging, or refetching absolutely anything at any time.  Their main intended
> >> + * use is to mediate communication between process-level code and irq/NMI
> >> + * handlers, all running on the same CPU.
> > 
> > This last sentence is now obsolete.  How about something like this?
> > 
> > 	Their two major use cases are: (1) Mediating communication
> > 	between process-level code and irq/NMI handlers, all running
> > 	on the same CPU, and (2) Ensuring that the compiler does not
> > 	fold, spindle, or otherwise mutilate accesses that either do
> > 	not require ordering or that interact with an explicit memory
> > 	barrier or atomic instruction that provides the required ordering.
> 
> sounds good. Will change.
> 
> Thanks
> 
> > 
> >> + */
> >> +
> >> +#define READ_ONCE(p) \
> >> +      ({ typeof(p) __val; __read_once_size(&p, &__val, sizeof(__val)); __val; })
> >> +
> >> +#define ASSIGN_ONCE(val, p) \
> >> +      ({ typeof(p) __val; __val = val; __assign_once_size(&p, &__val, sizeof(__val)); __val; })
> >> +
> >>  #endif /* __KERNEL__ */
> >>
> >>  #endif /* __ASSEMBLY__ */
> >> -- 
> >> 1.9.3
> >>
> 
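
To illustrate the two use cases in kernel terms -- a minimal sketch,
assuming a driver with a flag shared between process context and its
irq handler (all names hypothetical):

	static int demo_done;

	static irqreturn_t demo_irq(int irq, void *dev)
	{
		ASSIGN_ONCE(1, demo_done);	/* one forced store */
		return IRQ_HANDLED;
	}

	static void demo_wait(void)
	{
		/* a fresh load each iteration; a plain read could be
		 * hoisted out of the loop by the compiler
		 */
		while (!READ_ONCE(demo_done))
			cpu_relax();
	}

And for use case (2), the same macros pair with an explicit barrier
that provides the ordering (do_something() is assumed):

	data = 42;			/* producer */
	smp_wmb();
	ASSIGN_ONCE(1, ready);

	if (READ_ONCE(ready)) {		/* consumer */
		smp_rmb();
		do_something(data);
	}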


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/9] kernel: tighten rules for ACCESS ONCE
  2014-12-04  9:28     ` Christian Borntraeger
@ 2014-12-04 14:41       ` Paul E. McKenney
  0 siblings, 0 replies; 25+ messages in thread
From: Paul E. McKenney @ 2014-12-04 14:41 UTC (permalink / raw)
  To: Christian Borntraeger; +Cc: linux-kernel, linux-arch, torvalds

On Thu, Dec 04, 2014 at 10:28:25AM +0100, Christian Borntraeger wrote:
> On 04.12.2014 at 01:16, Paul E. McKenney wrote:
> >> >   * merging, or refetching absolutely anything at any time.  Its main intended
> >> >   * use is to mediate communication between process-level code and irq/NMI
> >> >   * handlers, all running on the same CPU.
> > This comment is obsolete in the same way as that of READ_ONCE() and
> > ASSIGN_ONCE(), but probably more to the point to just get rid of
> > ACCESS_ONCE().  ;-)
> > 
> >> > 
> 
> It's now
> 
> /*
>  * Prevent the compiler from merging or refetching accesses.  The compiler
>  * is also forbidden from reordering successive instances of ACCESS_ONCE(),
>  * but only when the compiler is aware of some particular ordering.  One way
>  * to make the compiler aware of ordering is to put the two invocations of
>  * ACCESS_ONCE() in different C statements.
>  *
>  * ACCESS_ONCE will only work on scalar types. For union types, ACCESS_ONCE
>  * on a union member will work as long as the size of the member matches the
>  * size of the union and the size does not exceed the machine word size.
>  *
>  * The major use cases of ACCESS_ONCE used to be (1) Mediating communication
>  * between process-level code and irq/NMI handlers, all running on the same CPU,
>  * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
>  * mutilate accesses that either do not require ordering or that interact
>  * with an explicit memory barrier or atomic instruction that provides the
>  * required ordering.
>  *
>  * If possible, use READ_ONCE/ASSIGN_ONCE instead.
>  */

Looks good!
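
As a hedged illustration of the union rule above (the type is
hypothetical, modeled on the x86 ticket lock): ACCESS_ONCE on the
scalar member is still accepted, because the member covers the whole
union and fits in a machine word.

	typedef union {
		u32 slock;		/* same size as the union */
		struct {
			u16 head, tail;
		} tickets;
	} demo_lock_t;

	static inline u32 demo_peek(demo_lock_t *lock)
	{
		return ACCESS_ONCE(lock->slock);
	}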

							Thanx, Paul


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (8 preceding siblings ...)
  2014-12-03 22:30 ` [PATCH 9/9] kernel: tighten rules for ACCESS ONCE Christian Borntraeger
@ 2014-12-04 15:24 ` Christian Borntraeger
  2014-12-04 23:40 ` Linus Torvalds
  10 siblings, 0 replies; 25+ messages in thread
From: Christian Borntraeger @ 2014-12-04 15:24 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-arch, paulmck, torvalds

On 03.12.2014 at 23:30, Christian Borntraeger wrote:
> As discussed on LKML http://marc.info/?i=54611D86.4040306%40de.ibm.com
> ACCESS_ONCE might fail with specific compiler for non-scalar accesses.
> 
> Here is a set of patches to tackle that problem.
> 
> The first patch introduces READ_ONCE and ASSIGN_ONCE. If the data structure
> is larger than the machine word size, memcpy is used and a warning is emitted.
> The next patches fix up all in-tree users of ACCESS_ONCE on non-scalar types.
> The last patch forces ACCESS_ONCE to work only on scalar types. 
> 
> I have cross-compiled the resulting kernel with defconfig and gcc 4.9 for
> microblaze, m68k, alpha, s390, x86_64, i686, sparc, sparc64, mips,
> ia64, arm and arm64.
> 
> Runtime tested on s390x and x86_64. I have also verified that ASSIGN_ONCE works
> as expected with some test changes, as there are no users in this patch series.
> 
> Linus, ok for the next merge window?
> 
> Christian Borntraeger (9):
>   kernel: Provide READ_ONCE and ASSIGN_ONCE
>   mm: replace ACCESS_ONCE with READ_ONCE or barriers
>   x86/spinlock: Replace ACCESS_ONCE with READ_ONCE
>   x86/gup: Replace ACCESS_ONCE with READ_ONCE
>   mips/gup: Replace ACCESS_ONCE with READ_ONCE
>   arm64/spinlock: Replace ACCESS_ONCE READ_ONCE
>   arm/spinlock: Replace ACCESS_ONCE with READ_ONCE
>   s390/kvm: REPLACE ACCESS_ONCE with READ_ONCE
>   kernel: tighten rules for ACCESS ONCE
> 
>  arch/arm/include/asm/spinlock.h   |  4 +--
>  arch/arm64/include/asm/spinlock.h |  4 +--
>  arch/mips/mm/gup.c                |  2 +-
>  arch/s390/kvm/gaccess.c           | 14 ++++----
>  arch/x86/include/asm/spinlock.h   |  8 ++---
>  arch/x86/mm/gup.c                 |  2 +-
>  include/linux/compiler.h          | 74 ++++++++++++++++++++++++++++++++++++++-
>  mm/gup.c                          |  2 +-
>  mm/memory.c                       |  2 +-
>  mm/rmap.c                         |  3 +-
>  10 files changed, 95 insertions(+), 20 deletions(-)
> 

FWIW, the code is on

git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux.git linux-next
and
git://git.kernel.org/pub/scm/linux/kernel/git/borntraeger/linux.git ACCESS_ONCE

I rebased the queue against rc3 + the initial patch that triggered the whole discussion.

Christian


^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses
  2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
                   ` (9 preceding siblings ...)
  2014-12-04 15:24 ` [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
@ 2014-12-04 23:40 ` Linus Torvalds
  10 siblings, 0 replies; 25+ messages in thread
From: Linus Torvalds @ 2014-12-04 23:40 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Linux Kernel Mailing List, linux-arch, Paul McKenney

On Wed, Dec 3, 2014 at 2:30 PM, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
>
> Linus, ok for the next merge window?

Yup, sounds like a plan to me.

              Linus

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-12-04 23:40 UTC | newest]

Thread overview: 25+ messages
2014-12-03 22:30 [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
2014-12-03 22:30 ` [PATCH 1/9] kernel: Provide READ_ONCE and ASSIGN_ONCE Christian Borntraeger
2014-12-04  0:07   ` Paul E. McKenney
2014-12-04  9:24     ` Christian Borntraeger
2014-12-04 14:41       ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 2/9] mm: replace ACCESS_ONCE with READ_ONCE or barriers Christian Borntraeger
2014-12-04  0:09   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 3/9] x86/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
2014-12-04  0:10   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 4/9] x86/gup: " Christian Borntraeger
2014-12-04  0:10   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 5/9] mips/gup: " Christian Borntraeger
2014-12-04  0:11   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 6/9] arm64/spinlock: Replace ACCESS_ONCE READ_ONCE Christian Borntraeger
2014-12-04  0:11   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 7/9] arm/spinlock: Replace ACCESS_ONCE with READ_ONCE Christian Borntraeger
2014-12-04  0:12   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 8/9] s390/kvm: REPLACE " Christian Borntraeger
2014-12-04  0:12   ` Paul E. McKenney
2014-12-03 22:30 ` [PATCH 9/9] kernel: tighten rules for ACCESS ONCE Christian Borntraeger
2014-12-04  0:16   ` Paul E. McKenney
2014-12-04  9:28     ` Christian Borntraeger
2014-12-04 14:41       ` Paul E. McKenney
2014-12-04 15:24 ` [PATCHv4 0/9] ACCESS_ONCE and non-scalar accesses Christian Borntraeger
2014-12-04 23:40 ` Linus Torvalds
