* [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup
@ 2022-08-08 7:13 guoren
2022-08-08 7:13 ` [PATCH V9 01/15] asm-generic: ticket-lock: Remove unnecessary atomic_read guoren
` (15 more replies)
0 siblings, 16 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
In this series:
- Clean up the generic ticket-lock code (use smp_mb__after_spinlock
  as the RCsc synchronization point)
- Add qspinlock and combo-lock support for riscv
- Add qspinlock to openrisc
- Use the generic headers in csky
- Optimize the cmpxchg & atomic code
This series enables qspinlock and meets the requirements mentioned in
commit a8ad07e5240c9 ("asm-generic: qspinlock: Indicate the use of
mixed-size atomics").
RISC-V LR/SC pairs can provide either a strong or a weak forward
guarantee, depending on the micro-architecture, and the RISC-V ISA spec
sets out several constraints that hardware must satisfy to support a
strict forward guarantee (RISC-V User ISA - 8.3 Eventual Success of
Store-Conditional Instructions).
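For reference, that eventual-success guarantee only applies to
constrained LR/SC loops that satisfy the chapter's code-size and
instruction restrictions; a minimal compare-and-swap loop of that form,
adapted from the spec's example (sketch only):

    # a0 = address, a1 = expected value, a2 = desired value
    cas:
        lr.w  t0, (a0)       # load-reserved the original value
        bne   t0, a1, fail   # mismatch: give up
        sc.w  t0, a2, (a0)   # try to store the new value
        bnez  t0, cas        # retry if the reservation was lost
        # fall through on success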
For example, some riscv hardware such as BOOMv3 & XiangShan can provide
a strict & strong forward guarantee (the cache line is kept in an
exclusive state for a backoff period, and only the local core's
interrupts can break the LR/SC pair). QEMU's riscv target currently
gives only a weak forward guarantee because of an incorrect
implementation [1].
So we add combo spinlock (ticket & queued) support for riscv, letting
processors with different micro-architectural memory models use the
same Image. The first attempt at qspinlock for riscv was made in
early 2019 [2].
[1] https://github.com/qemu/qemu/blob/master/target/riscv/insn_trans/trans_rva.c.inc
[2] https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/#r
Guo Ren (15):
asm-generic: ticket-lock: Remove unnecessary atomic_read
asm-generic: ticket-lock: Use the same struct definitions with qspinlock
asm-generic: ticket-lock: Move into ticket_spinlock.h
asm-generic: ticket-lock: Keep ticket-lock the same semantic with qspinlock
asm-generic: spinlock: Add queued spinlock support in common header
riscv: atomic: Clean up unnecessary acquire and release definitions
riscv: cmpxchg: Remove xchg32 and xchg64
riscv: cmpxchg: Forbid arch_cmpxchg64 for 32-bit
riscv: cmpxchg: Optimize cmpxchg64
riscv: Enable ARCH_INLINE_READ*/WRITE*/SPIN*
riscv: Add qspinlock support
riscv: Add combo spinlock support
openrisc: cmpxchg: Cleanup unnecessary codes
openrisc: Move from ticket-lock to qspinlock
csky: spinlock: Use the generic header files
arch/csky/include/asm/Kbuild | 2 +
arch/csky/include/asm/spinlock.h | 12 --
arch/csky/include/asm/spinlock_types.h | 9 --
arch/openrisc/Kconfig | 1 +
arch/openrisc/include/asm/Kbuild | 2 +
arch/openrisc/include/asm/cmpxchg.h | 192 ++++++++++---------------
arch/riscv/Kconfig | 49 +++++++
arch/riscv/include/asm/Kbuild | 3 +-
arch/riscv/include/asm/atomic.h | 19 ---
arch/riscv/include/asm/cmpxchg.h | 177 +++++++----------------
arch/riscv/include/asm/spinlock.h | 77 ++++++++++
arch/riscv/kernel/setup.c | 22 +++
include/asm-generic/spinlock.h | 94 ++----------
include/asm-generic/spinlock_types.h | 12 +-
include/asm-generic/ticket_spinlock.h | 93 ++++++++++++
15 files changed, 384 insertions(+), 380 deletions(-)
delete mode 100644 arch/csky/include/asm/spinlock.h
delete mode 100644 arch/csky/include/asm/spinlock_types.h
create mode 100644 arch/riscv/include/asm/spinlock.h
create mode 100644 include/asm-generic/ticket_spinlock.h
--
2.36.1
* [PATCH V9 01/15] asm-generic: ticket-lock: Remove unnecessary atomic_read
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 02/15] asm-generic: ticket-lock: Use the same struct definitions with qspinlock guoren
` (14 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Remove the unnecessary atomic_read() in arch_spin_value_unlocked(lock):
the lock is passed by value, so its snapshot is already at hand. This
also keeps arch_spin_value_unlocked() from re-reading, and contending
on, the spinlock's memory.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
include/asm-generic/spinlock.h | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index fdfebcb050f4..90803a826ba0 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -68,11 +68,18 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
smp_store_release(ptr, (u16)val + 1);
}
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+ u32 val = lock.counter;
+
+ return ((val >> 16) == (val & 0xffff));
+}
+
static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
{
- u32 val = atomic_read(lock);
+ arch_spinlock_t val = READ_ONCE(*lock);
- return ((val >> 16) != (val & 0xffff));
+ return !arch_spin_value_unlocked(val);
}
static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
@@ -82,11 +89,6 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
return (s16)((val >> 16) - (val & 0xffff)) > 1;
}
-static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
-{
- return !arch_spin_is_locked(&lock);
-}
-
#include <asm/qrwlock.h>
#endif /* __ASM_GENERIC_SPINLOCK_H */
--
2.36.1
* [PATCH V9 02/15] asm-generic: ticket-lock: Use the same struct definitions with qspinlock
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
2022-08-08 7:13 ` [PATCH V9 01/15] asm-generic: ticket-lock: Remove unnecessary atomic_read guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 03/15] asm-generic: ticket-lock: Move into ticket_spinlock.h guoren
` (13 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Let ticket-lock use the same struct definitions as qspinlock, so that
we can later move to a combo spinlock (combining ticket & queued); the
shared layout is sketched below.
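For reference, after this change arch_spinlock_t takes the qspinlock
layout from include/asm-generic/qspinlock_types.h, roughly as follows
(trimmed to the little-endian variant), which is why the accessors
switch from the bare atomic_t to lock->val:

typedef struct qspinlock {
	union {
		atomic_t val;
		struct {
			u8	locked;
			u8	pending;
		};
		struct {
			u16	locked_pending;
			u16	tail;
		};
	};
} arch_spinlock_t;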
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
include/asm-generic/spinlock.h | 14 +++++++-------
include/asm-generic/spinlock_types.h | 12 ++----------
2 files changed, 9 insertions(+), 17 deletions(-)
diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 90803a826ba0..4773334ee638 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -32,7 +32,7 @@
static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
- u32 val = atomic_fetch_add(1<<16, lock);
+ u32 val = atomic_fetch_add(1<<16, &lock->val);
u16 ticket = val >> 16;
if (ticket == (u16)val)
@@ -46,31 +46,31 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
* have no outstanding writes due to the atomic_fetch_add() the extra
* orderings are free.
*/
- atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+ atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
smp_mb();
}
static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
{
- u32 old = atomic_read(lock);
+ u32 old = atomic_read(&lock->val);
if ((old >> 16) != (old & 0xffff))
return false;
- return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+ return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
}
static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
{
u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
- u32 val = atomic_read(lock);
+ u32 val = atomic_read(&lock->val);
smp_store_release(ptr, (u16)val + 1);
}
static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
{
- u32 val = lock.counter;
+ u32 val = lock.val.counter;
return ((val >> 16) == (val & 0xffff));
}
@@ -84,7 +84,7 @@ static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
{
- u32 val = atomic_read(lock);
+ u32 val = atomic_read(&lock->val);
return (s16)((val >> 16) - (val & 0xffff)) > 1;
}
diff --git a/include/asm-generic/spinlock_types.h b/include/asm-generic/spinlock_types.h
index 8962bb730945..f534aa5de394 100644
--- a/include/asm-generic/spinlock_types.h
+++ b/include/asm-generic/spinlock_types.h
@@ -3,15 +3,7 @@
#ifndef __ASM_GENERIC_SPINLOCK_TYPES_H
#define __ASM_GENERIC_SPINLOCK_TYPES_H
-#include <linux/types.h>
-typedef atomic_t arch_spinlock_t;
-
-/*
- * qrwlock_types depends on arch_spinlock_t, so we must typedef that before the
- * include.
- */
-#include <asm/qrwlock_types.h>
-
-#define __ARCH_SPIN_LOCK_UNLOCKED ATOMIC_INIT(0)
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>
#endif /* __ASM_GENERIC_SPINLOCK_TYPES_H */
--
2.36.1
* [PATCH V9 03/15] asm-generic: ticket-lock: Move into ticket_spinlock.h
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
2022-08-08 7:13 ` [PATCH V9 01/15] asm-generic: ticket-lock: Remove unnecessary atomic_read guoren
2022-08-08 7:13 ` [PATCH V9 02/15] asm-generic: ticket-lock: Use the same struct definitions with qspinlock guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 04/15] asm-generic: ticket-lock: Keep ticket-lock the same semantic with qspinlock guoren
` (12 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Move the ticket-lock definitions into an independent file. This is a
preparation patch for merging qspinlock support into the asm-generic
spinlock header.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
include/asm-generic/spinlock.h | 87 +---------------------
include/asm-generic/ticket_spinlock.h | 103 ++++++++++++++++++++++++++
2 files changed, 104 insertions(+), 86 deletions(-)
create mode 100644 include/asm-generic/ticket_spinlock.h
diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 4773334ee638..970590baf61b 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -1,94 +1,9 @@
/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * 'Generic' ticket-lock implementation.
- *
- * It relies on atomic_fetch_add() having well defined forward progress
- * guarantees under contention. If your architecture cannot provide this, stick
- * to a test-and-set lock.
- *
- * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
- * sub-word of the value. This is generally true for anything LL/SC although
- * you'd be hard pressed to find anything useful in architecture specifications
- * about this. If your architecture cannot do this you might be better off with
- * a test-and-set.
- *
- * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
- * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
- * a full fence after the spin to upgrade the otherwise-RCpc
- * atomic_cond_read_acquire().
- *
- * The implementation uses smp_cond_load_acquire() to spin, so if the
- * architecture has WFE like instructions to sleep instead of poll for word
- * modifications be sure to implement that (see ARM64 for example).
- *
- */
-
#ifndef __ASM_GENERIC_SPINLOCK_H
#define __ASM_GENERIC_SPINLOCK_H
-#include <linux/atomic.h>
-#include <asm-generic/spinlock_types.h>
-
-static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
-{
- u32 val = atomic_fetch_add(1<<16, &lock->val);
- u16 ticket = val >> 16;
-
- if (ticket == (u16)val)
- return;
-
- /*
- * atomic_cond_read_acquire() is RCpc, but rather than defining a
- * custom cond_read_rcsc() here we just emit a full fence. We only
- * need the prior reads before subsequent writes ordering from
- * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
- * have no outstanding writes due to the atomic_fetch_add() the extra
- * orderings are free.
- */
- atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
- smp_mb();
-}
-
-static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
-{
- u32 old = atomic_read(&lock->val);
-
- if ((old >> 16) != (old & 0xffff))
- return false;
-
- return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
-}
-
-static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
- u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
- u32 val = atomic_read(&lock->val);
-
- smp_store_release(ptr, (u16)val + 1);
-}
-
-static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
-{
- u32 val = lock.val.counter;
-
- return ((val >> 16) == (val & 0xffff));
-}
-
-static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
-{
- arch_spinlock_t val = READ_ONCE(*lock);
-
- return !arch_spin_value_unlocked(val);
-}
-
-static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
-{
- u32 val = atomic_read(&lock->val);
-
- return (s16)((val >> 16) - (val & 0xffff)) > 1;
-}
-
+#include <asm-generic/ticket_spinlock.h>
#include <asm/qrwlock.h>
#endif /* __ASM_GENERIC_SPINLOCK_H */
diff --git a/include/asm-generic/ticket_spinlock.h b/include/asm-generic/ticket_spinlock.h
new file mode 100644
index 000000000000..cfcff22b37b3
--- /dev/null
+++ b/include/asm-generic/ticket_spinlock.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well defined forward progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC although
+ * you'd be hard pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this you might be better off with
+ * a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
+ * a full fence after the spin to upgrade the otherwise-RCpc
+ * atomic_cond_read_acquire().
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE like instructions to sleep instead of poll for word
+ * modifications be sure to implement that (see ARM64 for example).
+ *
+ */
+
+#ifndef __ASM_GENERIC_TICKET_SPINLOCK_H
+#define __ASM_GENERIC_TICKET_SPINLOCK_H
+
+#include <linux/atomic.h>
+#include <asm-generic/spinlock_types.h>
+
+static __always_inline void ticket_spin_lock(arch_spinlock_t *lock)
+{
+ u32 val = atomic_fetch_add(1<<16, &lock->val);
+ u16 ticket = val >> 16;
+
+ if (ticket == (u16)val)
+ return;
+
+ /*
+ * atomic_cond_read_acquire() is RCpc, but rather than defining a
+ * custom cond_read_rcsc() here we just emit a full fence. We only
+ * need the prior reads before subsequent writes ordering from
+ * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
+ * have no outstanding writes due to the atomic_fetch_add() the extra
+ * orderings are free.
+ */
+ atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
+ smp_mb();
+}
+
+static __always_inline bool ticket_spin_trylock(arch_spinlock_t *lock)
+{
+ u32 old = atomic_read(&lock->val);
+
+ if ((old >> 16) != (old & 0xffff))
+ return false;
+
+ return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void ticket_spin_unlock(arch_spinlock_t *lock)
+{
+ u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+ u32 val = atomic_read(&lock->val);
+
+ smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int ticket_spin_value_unlocked(arch_spinlock_t lock)
+{
+ u32 val = lock.val.counter;
+
+ return ((val >> 16) == (val & 0xffff));
+}
+
+static __always_inline int ticket_spin_is_locked(arch_spinlock_t *lock)
+{
+ arch_spinlock_t val = READ_ONCE(*lock);
+
+ return !ticket_spin_value_unlocked(val);
+}
+
+static __always_inline int ticket_spin_is_contended(arch_spinlock_t *lock)
+{
+ u32 val = atomic_read(&lock->val);
+
+ return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+/*
+ * Remapping spinlock architecture specific functions to the corresponding
+ * ticket spinlock functions.
+ */
+#define arch_spin_is_locked(l) ticket_spin_is_locked(l)
+#define arch_spin_is_contended(l) ticket_spin_is_contended(l)
+#define arch_spin_value_unlocked(l) ticket_spin_value_unlocked(l)
+#define arch_spin_lock(l) ticket_spin_lock(l)
+#define arch_spin_trylock(l) ticket_spin_trylock(l)
+#define arch_spin_unlock(l) ticket_spin_unlock(l)
+
+#endif /* __ASM_GENERIC_TICKET_SPINLOCK_H */
--
2.36.1
* [PATCH V9 04/15] asm-generic: ticket-lock: Keep ticket-lock the same semantic with qspinlock
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (2 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 03/15] asm-generic: ticket-lock: Move into ticket_spinlock.h guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 05/15] asm-generic: spinlock: Add queued spinlock support in common header guoren
` (11 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Define smp_mb__after_spinlock() as smp_mb() by default to give all
architectures an RCsc synchronization point. This keeps ticket-lock
with the same semantics as qspinlock: lock itself is only an acquire
(RCpc) synchronization point. For more detail, see
include/linux/spinlock.h.
Some architectures, e.g. riscv, could give more robust semantics than
smp_mb(). Others don't need smp_mb__after_spinlock() at all because
their spinlocks already contain an RCsc.
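Concretely, paraphrasing the first litmus test documented in
include/linux/spinlock.h, smp_mb__after_spinlock() forbids the
r0 == 0 && r1 == 0 outcome below; an acquire-only (RCpc) lock would
allow it:

	{ X = 0; Y = 0; }

	CPU0				CPU1

	WRITE_ONCE(X, 1);		WRITE_ONCE(Y, 1);
	spin_lock(S);			smp_mb();
	smp_mb__after_spinlock();	r1 = READ_ONCE(X);
	r0 = READ_ONCE(Y);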
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
---
include/asm-generic/spinlock.h | 5 +++++
include/asm-generic/ticket_spinlock.h | 18 ++++--------------
2 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 970590baf61b..6f5a1b838ca2 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -6,4 +6,9 @@
#include <asm-generic/ticket_spinlock.h>
#include <asm/qrwlock.h>
+/* See include/linux/spinlock.h */
+#ifndef smp_mb__after_spinlock
+#define smp_mb__after_spinlock() smp_mb()
+#endif
+
#endif /* __ASM_GENERIC_SPINLOCK_H */
diff --git a/include/asm-generic/ticket_spinlock.h b/include/asm-generic/ticket_spinlock.h
index cfcff22b37b3..d8e6ec82f096 100644
--- a/include/asm-generic/ticket_spinlock.h
+++ b/include/asm-generic/ticket_spinlock.h
@@ -14,9 +14,8 @@
* a test-and-set.
*
* It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
- * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
- * a full fence after the spin to upgrade the otherwise-RCpc
- * atomic_cond_read_acquire().
+ * uses smp_mb__after_spinlock which is RCsc to create an RCsc hot path, See
+ * include/linux/spinlock.h
*
* The implementation uses smp_cond_load_acquire() to spin, so if the
* architecture has WFE like instructions to sleep instead of poll for word
@@ -32,22 +31,13 @@
static __always_inline void ticket_spin_lock(arch_spinlock_t *lock)
{
- u32 val = atomic_fetch_add(1<<16, &lock->val);
+ u32 val = atomic_fetch_add_acquire(1<<16, &lock->val);
u16 ticket = val >> 16;
if (ticket == (u16)val)
return;
- /*
- * atomic_cond_read_acquire() is RCpc, but rather than defining a
- * custom cond_read_rcsc() here we just emit a full fence. We only
- * need the prior reads before subsequent writes ordering from
- * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
- * have no outstanding writes due to the atomic_fetch_add() the extra
- * orderings are free.
- */
atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
- smp_mb();
}
static __always_inline bool ticket_spin_trylock(arch_spinlock_t *lock)
@@ -57,7 +47,7 @@ static __always_inline bool ticket_spin_trylock(arch_spinlock_t *lock)
if ((old >> 16) != (old & 0xffff))
return false;
- return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
+ return atomic_try_cmpxchg_acquire(&lock->val, &old, old + (1<<16));
}
static __always_inline void ticket_spin_unlock(arch_spinlock_t *lock)
--
2.36.1
* [PATCH V9 05/15] asm-generic: spinlock: Add queued spinlock support in common header
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (3 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 04/15] asm-generic: ticket-lock: Keep ticket-lock the same semantic with qspinlock guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 06/15] riscv: atomic: Clean up unnecessary acquire and release definitions guoren
` (10 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Select the queued spinlock or the ticket lock via
CONFIG_QUEUED_SPINLOCKS in the common header file.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
include/asm-generic/spinlock.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 6f5a1b838ca2..349cdb46a99c 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -3,7 +3,11 @@
#ifndef __ASM_GENERIC_SPINLOCK_H
#define __ASM_GENERIC_SPINLOCK_H
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm-generic/qspinlock.h>
+#else
#include <asm-generic/ticket_spinlock.h>
+#endif
#include <asm/qrwlock.h>
/* See include/linux/spinlock.h */
--
2.36.1
* [PATCH V9 06/15] riscv: atomic: Clean up unnecessary acquire and release definitions
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (4 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 05/15] asm-generic: spinlock: Add queued spinlock support in common header guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 07/15] riscv: cmpxchg: Remove xchg32 and xchg64 guoren
` (9 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Clean up the custom xchg_acquire, xchg_release, and cmpxchg_release
definitions: the generic implementations generate the same code as the
riscv-specific ones.
Before the patch:
000000000000024e <.LBB238>:
ops = xchg_acquire(pending_ipis, 0);
24e: 089937af amoswap.d a5,s1,(s2)
252: 0230000f fence r,rw
0000000000000256 <.LBB243>:
ops = xchg_release(pending_ipis, 0);
256: 0310000f fence rw,w
25a: 089934af amoswap.d s1,s1,(s2)
After the patch:
000000000000026e <.LBB245>:
ops = xchg_acquire(pending_ipis, 0);
26e: 089937af amoswap.d a5,s1,(s2)
0000000000000272 <.LBE247>:
272: 0230000f fence r,rw
0000000000000276 <.LBB249>:
ops = xchg_release(pending_ipis, 0);
276: 0310000f fence rw,w
000000000000027a <.LBB251>:
27a: 089934af amoswap.d s1,s1,(s2)
Only cmpxchg_acquire still needs a custom definition: it avoids an
unnecessary acquire ordering when the value returned by lr differs
from old, as sketched below.
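For reference, the retained riscv __cmpxchg_acquire places the acquire
barrier on the success path only, so a failed comparison branches past
it; roughly (trimmed to the 32-bit case):

	case 4:							\
		__asm__ __volatile__ (				\
			"0:	lr.w %0, %2\n"			\
			"	bne  %0, %z3, 1f\n"		\
			"	sc.w %1, %z4, %2\n"		\
			"	bnez %1, 0b\n"			\
			RISCV_ACQUIRE_BARRIER			\
			"1:\n"					\
			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
			: "rJ" ((long)__old), "rJ" (__new)	\
			: "memory");				\
		break;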
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/atomic.h | 19 -----
arch/riscv/include/asm/cmpxchg.h | 116 -------------------------------
2 files changed, 135 deletions(-)
diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 0dfe9d857a76..83636320ba95 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -249,16 +249,6 @@ c_t arch_atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
return __xchg_relaxed(&(v->counter), n, size); \
} \
static __always_inline \
-c_t arch_atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
-{ \
- return __xchg_acquire(&(v->counter), n, size); \
-} \
-static __always_inline \
-c_t arch_atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
-{ \
- return __xchg_release(&(v->counter), n, size); \
-} \
-static __always_inline \
c_t arch_atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n) \
{ \
return __xchg(&(v->counter), n, size); \
@@ -276,12 +266,6 @@ c_t arch_atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v, \
return __cmpxchg_acquire(&(v->counter), o, n, size); \
} \
static __always_inline \
-c_t arch_atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v, \
- c_t o, c_t n) \
-{ \
- return __cmpxchg_release(&(v->counter), o, n, size); \
-} \
-static __always_inline \
c_t arch_atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
{ \
return __cmpxchg(&(v->counter), o, n, size); \
@@ -299,12 +283,9 @@ c_t arch_atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
ATOMIC_OPS()
#define arch_atomic_xchg_relaxed arch_atomic_xchg_relaxed
-#define arch_atomic_xchg_acquire arch_atomic_xchg_acquire
-#define arch_atomic_xchg_release arch_atomic_xchg_release
#define arch_atomic_xchg arch_atomic_xchg
#define arch_atomic_cmpxchg_relaxed arch_atomic_cmpxchg_relaxed
#define arch_atomic_cmpxchg_acquire arch_atomic_cmpxchg_acquire
-#define arch_atomic_cmpxchg_release arch_atomic_cmpxchg_release
#define arch_atomic_cmpxchg arch_atomic_cmpxchg
#undef ATOMIC_OPS
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 12debce235e5..67ab6375b650 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -44,76 +44,6 @@
_x_, sizeof(*(ptr))); \
})
-#define __xchg_acquire(ptr, new, size) \
-({ \
- __typeof__(ptr) __ptr = (ptr); \
- __typeof__(new) __new = (new); \
- __typeof__(*(ptr)) __ret; \
- switch (size) { \
- case 4: \
- __asm__ __volatile__ ( \
- " amoswap.w %0, %2, %1\n" \
- RISCV_ACQUIRE_BARRIER \
- : "=r" (__ret), "+A" (*__ptr) \
- : "r" (__new) \
- : "memory"); \
- break; \
- case 8: \
- __asm__ __volatile__ ( \
- " amoswap.d %0, %2, %1\n" \
- RISCV_ACQUIRE_BARRIER \
- : "=r" (__ret), "+A" (*__ptr) \
- : "r" (__new) \
- : "memory"); \
- break; \
- default: \
- BUILD_BUG(); \
- } \
- __ret; \
-})
-
-#define arch_xchg_acquire(ptr, x) \
-({ \
- __typeof__(*(ptr)) _x_ = (x); \
- (__typeof__(*(ptr))) __xchg_acquire((ptr), \
- _x_, sizeof(*(ptr))); \
-})
-
-#define __xchg_release(ptr, new, size) \
-({ \
- __typeof__(ptr) __ptr = (ptr); \
- __typeof__(new) __new = (new); \
- __typeof__(*(ptr)) __ret; \
- switch (size) { \
- case 4: \
- __asm__ __volatile__ ( \
- RISCV_RELEASE_BARRIER \
- " amoswap.w %0, %2, %1\n" \
- : "=r" (__ret), "+A" (*__ptr) \
- : "r" (__new) \
- : "memory"); \
- break; \
- case 8: \
- __asm__ __volatile__ ( \
- RISCV_RELEASE_BARRIER \
- " amoswap.d %0, %2, %1\n" \
- : "=r" (__ret), "+A" (*__ptr) \
- : "r" (__new) \
- : "memory"); \
- break; \
- default: \
- BUILD_BUG(); \
- } \
- __ret; \
-})
-
-#define arch_xchg_release(ptr, x) \
-({ \
- __typeof__(*(ptr)) _x_ = (x); \
- (__typeof__(*(ptr))) __xchg_release((ptr), \
- _x_, sizeof(*(ptr))); \
-})
-
#define __xchg(ptr, new, size) \
({ \
__typeof__(ptr) __ptr = (ptr); \
@@ -253,52 +183,6 @@
_o_, _n_, sizeof(*(ptr))); \
})
-#define __cmpxchg_release(ptr, old, new, size) \
-({ \
- __typeof__(ptr) __ptr = (ptr); \
- __typeof__(*(ptr)) __old = (old); \
- __typeof__(*(ptr)) __new = (new); \
- __typeof__(*(ptr)) __ret; \
- register unsigned int __rc; \
- switch (size) { \
- case 4: \
- __asm__ __volatile__ ( \
- RISCV_RELEASE_BARRIER \
- "0: lr.w %0, %2\n" \
- " bne %0, %z3, 1f\n" \
- " sc.w %1, %z4, %2\n" \
- " bnez %1, 0b\n" \
- "1:\n" \
- : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
- : "rJ" ((long)__old), "rJ" (__new) \
- : "memory"); \
- break; \
- case 8: \
- __asm__ __volatile__ ( \
- RISCV_RELEASE_BARRIER \
- "0: lr.d %0, %2\n" \
- " bne %0, %z3, 1f\n" \
- " sc.d %1, %z4, %2\n" \
- " bnez %1, 0b\n" \
- "1:\n" \
- : "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
- : "rJ" (__old), "rJ" (__new) \
- : "memory"); \
- break; \
- default: \
- BUILD_BUG(); \
- } \
- __ret; \
-})
-
-#define arch_cmpxchg_release(ptr, o, n) \
-({ \
- __typeof__(*(ptr)) _o_ = (o); \
- __typeof__(*(ptr)) _n_ = (n); \
- (__typeof__(*(ptr))) __cmpxchg_release((ptr), \
- _o_, _n_, sizeof(*(ptr))); \
-})
-
#define __cmpxchg(ptr, old, new, size) \
({ \
__typeof__(ptr) __ptr = (ptr); \
--
2.36.1
* [PATCH V9 07/15] riscv: cmpxchg: Remove xchg32 and xchg64
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (5 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 06/15] riscv: atomic: Clean up unnecessary acquire and release definitions guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 08/15] riscv: cmpxchg: Forbid arch_cmpxchg64 for 32-bit guoren
` (8 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
xchg32() and xchg64() are unused, so remove them.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/cmpxchg.h | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 67ab6375b650..567ed2e274c4 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -76,18 +76,6 @@
(__typeof__(*(ptr))) __xchg((ptr), _x_, sizeof(*(ptr))); \
})
-#define xchg32(ptr, x) \
-({ \
- BUILD_BUG_ON(sizeof(*(ptr)) != 4); \
- arch_xchg((ptr), (x)); \
-})
-
-#define xchg64(ptr, x) \
-({ \
- BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
- arch_xchg((ptr), (x)); \
-})
-
/*
* Atomic compare and exchange. Compare OLD with MEM, if identical,
* store NEW in MEM. Return the initial value in MEM. Success is
--
2.36.1
* [PATCH V9 08/15] riscv: cmpxchg: Forbid arch_cmpxchg64 for 32-bit
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (6 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 07/15] riscv: cmpxchg: Remove xchg32 and xchg64 guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 09/15] riscv: cmpxchg: Optimize cmpxchg64 guoren
` (7 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
32-bit RISC-V does not support the lr.d/sc.d instructions, so using
arch_cmpxchg64 there would be an error. Add BUILD_BUG_ON() guards to
forbid that situation at compile time.
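With the guards in place, an accidental 8-byte operation on rv32 fails
at build time instead of producing broken code; a hypothetical caller:

	u64 v = 0;

	/* size == 8: trips BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)) */
	arch_xchg(&v, 1ULL);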
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/cmpxchg.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 567ed2e274c4..14c9280c7f7f 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -25,6 +25,7 @@
: "memory"); \
break; \
case 8: \
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)); \
__asm__ __volatile__ ( \
" amoswap.d %0, %2, %1\n" \
: "=r" (__ret), "+A" (*__ptr) \
@@ -58,6 +59,7 @@
: "memory"); \
break; \
case 8: \
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)); \
__asm__ __volatile__ ( \
" amoswap.d.aqrl %0, %2, %1\n" \
: "=r" (__ret), "+A" (*__ptr) \
@@ -101,6 +103,7 @@
: "memory"); \
break; \
case 8: \
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)); \
__asm__ __volatile__ ( \
"0: lr.d %0, %2\n" \
" bne %0, %z3, 1f\n" \
@@ -146,6 +149,7 @@
: "memory"); \
break; \
case 8: \
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)); \
__asm__ __volatile__ ( \
"0: lr.d %0, %2\n" \
" bne %0, %z3, 1f\n" \
@@ -192,6 +196,7 @@
: "memory"); \
break; \
case 8: \
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT)); \
__asm__ __volatile__ ( \
"0: lr.d %0, %2\n" \
" bne %0, %z3, 1f\n" \
@@ -220,6 +225,7 @@
#define arch_cmpxchg_local(ptr, o, n) \
(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
+#ifdef CONFIG_64BIT
#define arch_cmpxchg64(ptr, o, n) \
({ \
BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
@@ -231,5 +237,6 @@
BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
arch_cmpxchg_relaxed((ptr), (o), (n)); \
})
+#endif /* CONFIG_64BIT */
#endif /* _ASM_RISCV_CMPXCHG_H */
--
2.36.1
* [PATCH V9 09/15] riscv: cmpxchg: Optimize cmpxchg64
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (7 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 08/15] riscv: cmpxchg: Forbid arch_cmpxchg64 for 32-bit guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 10/15] riscv: Enable ARCH_INLINE_READ*/WRITE*/SPIN* guoren
` (6 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Optimize cmpxchg64 by providing the relaxed, acquire, and release
variants.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 14c9280c7f7f..4b5fa25f4336 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -226,6 +226,24 @@
(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
#ifdef CONFIG_64BIT
+#define arch_cmpxchg64_relaxed(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ arch_cmpxchg_relaxed((ptr), (o), (n)); \
+})
+
+#define arch_cmpxchg64_acquire(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ arch_cmpxchg_acquire((ptr), (o), (n)); \
+})
+
+#define arch_cmpxchg64_release(ptr, o, n) \
+({ \
+ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
+ arch_cmpxchg_release((ptr), (o), (n)); \
+})
+
#define arch_cmpxchg64(ptr, o, n) \
({ \
BUILD_BUG_ON(sizeof(*(ptr)) != 8); \
--
2.36.1
* [PATCH V9 10/15] riscv: Enable ARCH_INLINE_READ*/WRITE*/SPIN*
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (8 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 09/15] riscv: cmpxchg: Optimize cmpxchg64 guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 11/15] riscv: Add qspinlock support guoren
` (5 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Enable ARCH_INLINE_READ*/WRITE*/SPIN* when !PREEMPTION, copied from
arch/arm64. Inlining the lock functions reduces procedure calls and
improves performance.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/Kconfig | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 51713e03c934..c3ca23bc6352 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -32,6 +32,32 @@ config RISCV
select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_INLINE_READ_LOCK if !PREEMPTION
+ select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
+ select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
+ select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
+ select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
+ select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
+ select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
+ select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
+ select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
+ select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
+ select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
+ select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
+ select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
+ select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
+ select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
+ select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
+ select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
+ select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
+ select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
+ select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
+ select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
select ARCH_STACKWALK
--
2.36.1
* [PATCH V9 11/15] riscv: Add qspinlock support
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (9 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 10/15] riscv: Enable ARCH_INLINE_READ*/WRITE*/SPIN* guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 12/15] riscv: Add combo spinlock support guoren
` (4 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
Enable qspinlock per the requirements mentioned in commit a8ad07e5240c9
("asm-generic: qspinlock: Indicate the use of mixed-size atomics"):
- RISC-V atomic_*_release()/atomic_*_acquire() are implemented as the
relaxed version plus an acquire/release fence, giving RCsc
synchronization.
- RISC-V LR/SC pairs can provide a strong or a weak forward guarantee,
depending on the micro-architecture, and the RISC-V ISA spec sets
out several constraints that hardware must satisfy to support a
strict forward guarantee (RISC-V User ISA - 8.3 Eventual Success of
Store-Conditional Instructions). Some riscv cores such as BOOMv3
& XiangShan can provide a strict & strong forward guarantee (the
cache line is kept in an exclusive state for a backoff period, and
only the local core's interrupts can break the LR/SC pair).
- RISC-V provides a cheap atomic_fetch_or_acquire() with RCsc.
- RISC-V provides only a relaxed xchg16 to support qspinlock (see the
sketch below).
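For context, the 16-bit exchange backs the tail update in the generic
qspinlock slow path, which, paraphrasing kernel/locking/qspinlock.c
(the _Q_PENDING_BITS == 8 case), looks roughly like:

static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
	/*
	 * Relaxed is sufficient: the caller publishes its MCS node
	 * before updating the tail half-word.
	 */
	return (u32)xchg_relaxed(&lock->tail,
				 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
}

This is what the new __xchg16_relaxed() below serves.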
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/Kconfig | 16 ++++++++++++++++
arch/riscv/include/asm/Kbuild | 2 ++
arch/riscv/include/asm/cmpxchg.h | 24 ++++++++++++++++++++++++
3 files changed, 42 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index c3ca23bc6352..8b36a4307d03 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -359,6 +359,22 @@ config NODES_SHIFT
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.
+choice
+ prompt "RISC-V spinlock type"
+ default RISCV_TICKET_SPINLOCKS
+
+config RISCV_TICKET_SPINLOCKS
+ bool "Using ticket spinlock"
+
+config RISCV_QUEUED_SPINLOCKS
+ bool "Using queued spinlock"
+ depends on SMP && MMU
+ select ARCH_USE_QUEUED_SPINLOCKS
+ help
+ Make sure your micro arch LL/SC has a strong forward progress guarantee.
+ Otherwise, stay at ticket-lock.
+endchoice
+
config RISCV_ALTERNATIVE
bool
depends on !XIP_KERNEL
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 504f8b7e72d4..2cce98c7b653 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -2,7 +2,9 @@
generic-y += early_ioremap.h
generic-y += flat.h
generic-y += kvm_para.h
+generic-y += mcs_spinlock.h
generic-y += parport.h
+generic-y += qspinlock.h
generic-y += spinlock.h
generic-y += spinlock_types.h
generic-y += qrwlock.h
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 4b5fa25f4336..2ba88057db52 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -11,12 +11,36 @@
#include <asm/barrier.h>
#include <asm/fence.h>
+static inline ulong __xchg16_relaxed(ulong new, void *ptr)
+{
+ ulong ret, tmp;
+ ulong shif = ((ulong)ptr & 2) ? 16 : 0;
+ ulong mask = 0xffff << shif;
+ ulong *__ptr = (ulong *)((ulong)ptr & ~2);
+
+ __asm__ __volatile__ (
+ "0: lr.w %0, %2\n"
+ " and %1, %0, %z3\n"
+ " or %1, %1, %z4\n"
+ " sc.w %1, %1, %2\n"
+ " bnez %1, 0b\n"
+ : "=&r" (ret), "=&r" (tmp), "+A" (*__ptr)
+ : "rJ" (~mask), "rJ" (new << shif)
+ : "memory");
+
+ return (ulong)((ret & mask) >> shif);
+}
+
#define __xchg_relaxed(ptr, new, size) \
({ \
__typeof__(ptr) __ptr = (ptr); \
__typeof__(new) __new = (new); \
__typeof__(*(ptr)) __ret; \
switch (size) { \
+ case 2: { \
+ __ret = (__typeof__(*(ptr))) \
+ __xchg16_relaxed((ulong)__new, __ptr); \
+ break;} \
case 4: \
__asm__ __volatile__ ( \
" amoswap.w %0, %2, %1\n" \
--
2.36.1
* [PATCH V9 12/15] riscv: Add combo spinlock support
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (10 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 11/15] riscv: Add qspinlock support guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 13/15] openrisc: cmpxchg: Cleanup unnecessary codes guoren
` (3 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
The combo spinlock supports both queued and ticket locks in one Linux
Image, selected at boot time with a command-line option (usage example
after the disassembly). Here is the per-function size (bytes)
comparison table:
TYPE : COMBO | TICKET | QUEUED
arch_spin_lock : 106 | 60 | 50
arch_spin_unlock : 54 | 36 | 26
arch_spin_trylock : 110 | 72 | 54
arch_spin_is_locked : 48 | 34 | 20
arch_spin_is_contended : 56 | 40 | 24
arch_spin_value_unlocked : 48 | 34 | 24
One example of disassemble combo arch_spin_unlock:
0xffffffff8000409c <+14>: nop # jump label slot
0xffffffff800040a0 <+18>: fence rw,w # queued spinlock start
0xffffffff800040a4 <+22>: sb zero,0(a4) # queued spinlock end
0xffffffff800040a8 <+26>: ld s0,8(sp)
0xffffffff800040aa <+28>: addi sp,sp,16
0xffffffff800040ac <+30>: ret
0xffffffff800040ae <+32>: lw a5,0(a4) # ticket spinlock start
0xffffffff800040b0 <+34>: sext.w a5,a5
0xffffffff800040b2 <+36>: fence rw,w
0xffffffff800040b6 <+40>: addiw a5,a5,1
0xffffffff800040b8 <+42>: slli a5,a5,0x30
0xffffffff800040ba <+44>: srli a5,a5,0x30
0xffffffff800040bc <+46>: sh a5,0(a4) # ticket spinlock end
0xffffffff800040c0 <+50>: ld s0,8(sp)
0xffffffff800040c2 <+52>: addi sp,sp,16
0xffffffff800040c4 <+54>: ret
The qspinlock is smaller and faster than the ticket-lock when
everything stays on the fast path, and the combo spinlock provides one
compatible Linux Image for processors with different micro-arch designs
(weak/strict forward guarantee).
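As a usage note, a combo kernel defaults to ticket-lock; the queued
spinlock is selected by adding the (valueless) qspinlock early
parameter to the kernel command line, e.g. (the surrounding parameters
are only illustrative):

	console=ttyS0 root=/dev/vda ro qspinlock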
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/riscv/Kconfig | 9 +++-
arch/riscv/include/asm/Kbuild | 1 -
arch/riscv/include/asm/spinlock.h | 77 +++++++++++++++++++++++++++++++
arch/riscv/kernel/setup.c | 22 +++++++++
4 files changed, 107 insertions(+), 2 deletions(-)
create mode 100644 arch/riscv/include/asm/spinlock.h
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 8b36a4307d03..6645f04c7da4 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -361,7 +361,7 @@ config NODES_SHIFT
choice
prompt "RISC-V spinlock type"
- default RISCV_TICKET_SPINLOCKS
+ default RISCV_COMBO_SPINLOCKS
config RISCV_TICKET_SPINLOCKS
bool "Using ticket spinlock"
@@ -373,6 +373,13 @@ config RISCV_QUEUED_SPINLOCKS
help
Make sure your micro arch LL/SC has a strong forward progress guarantee.
Otherwise, stay at ticket-lock.
+
+config RISCV_COMBO_SPINLOCKS
+ bool "Using combo spinlock"
+ depends on SMP && MMU
+ select ARCH_USE_QUEUED_SPINLOCKS
+ help
+ Select queued spinlock or ticket-lock with jump_label.
endchoice
config RISCV_ALTERNATIVE
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 2cce98c7b653..59d5ea7390ea 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -5,7 +5,6 @@ generic-y += kvm_para.h
generic-y += mcs_spinlock.h
generic-y += parport.h
generic-y += qspinlock.h
-generic-y += spinlock.h
generic-y += spinlock_types.h
generic-y += qrwlock.h
generic-y += qrwlock_types.h
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
new file mode 100644
index 000000000000..b079462d818b
--- /dev/null
+++ b/arch/riscv/include/asm/spinlock.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_RISCV_SPINLOCK_H
+#define __ASM_RISCV_SPINLOCK_H
+
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+#include <asm-generic/ticket_spinlock.h>
+
+#undef arch_spin_is_locked
+#undef arch_spin_is_contended
+#undef arch_spin_value_unlocked
+#undef arch_spin_lock
+#undef arch_spin_trylock
+#undef arch_spin_unlock
+
+#include <asm-generic/qspinlock.h>
+#include <linux/jump_label.h>
+
+#undef arch_spin_is_locked
+#undef arch_spin_is_contended
+#undef arch_spin_value_unlocked
+#undef arch_spin_lock
+#undef arch_spin_trylock
+#undef arch_spin_unlock
+
+DECLARE_STATIC_KEY_TRUE(qspinlock_key);
+
+static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ queued_spin_lock(lock);
+ else
+ ticket_spin_lock(lock);
+}
+
+static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ return queued_spin_trylock(lock);
+ return ticket_spin_trylock(lock);
+}
+
+static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ queued_spin_unlock(lock);
+ else
+ ticket_spin_unlock(lock);
+}
+
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ return queued_spin_value_unlocked(lock);
+ else
+ return ticket_spin_value_unlocked(lock);
+}
+
+static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ return queued_spin_is_locked(lock);
+ return ticket_spin_is_locked(lock);
+}
+
+static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
+{
+ if (static_branch_likely(&qspinlock_key))
+ return queued_spin_is_contended(lock);
+ return ticket_spin_is_contended(lock);
+}
+#include <asm/qrwlock.h>
+#else
+#include <asm-generic/spinlock.h>
+#endif /* CONFIG_RISCV_COMBO_SPINLOCKS */
+
+#endif /* __ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index f0f36a4a0e9b..b763039bf49b 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -261,6 +261,13 @@ static void __init parse_dtb(void)
#endif
}
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+DEFINE_STATIC_KEY_TRUE_RO(qspinlock_key);
+EXPORT_SYMBOL(qspinlock_key);
+
+static bool qspinlock_flag __initdata = false;
+#endif
+
void __init setup_arch(char **cmdline_p)
{
parse_dtb();
@@ -295,10 +302,25 @@ void __init setup_arch(char **cmdline_p)
setup_smp();
#endif
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+ if (!qspinlock_flag)
+ static_branch_disable(&qspinlock_key);
+#endif
+
riscv_fill_hwcap();
apply_boot_alternatives();
}
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+static int __init enable_qspinlock(char *p)
+{
+ qspinlock_flag = true;
+
+ return 0;
+}
+early_param("qspinlock", enable_qspinlock);
+#endif
+
static int __init topology_init(void)
{
int i, ret;
--
2.36.1
* [PATCH V9 13/15] openrisc: cmpxchg: Cleanup unnecessary codes
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (11 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 12/15] riscv: Add combo spinlock support guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 14/15] openrisc: Move from ticket-lock to qspinlock guoren
` (2 subsequent siblings)
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren,
Guo Ren, Jonas Bonn, Stefan Kristiansson
From: Guo Ren <guoren@linux.alibaba.com>
Remove cmpxchg_small() and xchg_small(): they are unnecessary now, and
emulating sub-word accesses with a cmpxchg_u32() retry loop breaks the
forward-progress guarantee for atomic operations, since the outer
software loop can lose the race indefinitely.
Also remove the unnecessary __HAVE_ARCH_CMPXCHG.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
---
arch/openrisc/include/asm/cmpxchg.h | 167 +++++++++-------------------
1 file changed, 50 insertions(+), 117 deletions(-)
diff --git a/arch/openrisc/include/asm/cmpxchg.h b/arch/openrisc/include/asm/cmpxchg.h
index 79fd16162ccb..df83b33b5882 100644
--- a/arch/openrisc/include/asm/cmpxchg.h
+++ b/arch/openrisc/include/asm/cmpxchg.h
@@ -20,10 +20,8 @@
#include <linux/compiler.h>
#include <linux/types.h>
-#define __HAVE_ARCH_CMPXCHG 1
-
-static inline unsigned long cmpxchg_u32(volatile void *ptr,
- unsigned long old, unsigned long new)
+/* cmpxchg */
+static inline u32 cmpxchg32(volatile void *ptr, u32 old, u32 new)
{
__asm__ __volatile__(
"1: l.lwa %0, 0(%1) \n"
@@ -41,8 +39,33 @@ static inline unsigned long cmpxchg_u32(volatile void *ptr,
return old;
}
-static inline unsigned long xchg_u32(volatile void *ptr,
- unsigned long val)
+#define __cmpxchg(ptr, old, new, size) \
+({ \
+ __typeof__(ptr) __ptr = (ptr); \
+ __typeof__(*(ptr)) __old = (old); \
+ __typeof__(*(ptr)) __new = (new); \
+ __typeof__(*(ptr)) __ret; \
+ switch (size) { \
+ case 4: \
+ __ret = (__typeof__(*(ptr))) \
+ cmpxchg32(__ptr, (u32)__old, (u32)__new); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __ret; \
+})
+
+#define arch_cmpxchg(ptr, o, n) \
+({ \
+ __typeof__(*(ptr)) _o_ = (o); \
+ __typeof__(*(ptr)) _n_ = (n); \
+ (__typeof__(*(ptr))) __cmpxchg((ptr), \
+ _o_, _n_, sizeof(*(ptr))); \
+})
+
+/* xchg */
+static inline u32 xchg32(volatile void *ptr, u32 val)
{
__asm__ __volatile__(
"1: l.lwa %0, 0(%1) \n"
@@ -56,116 +79,26 @@ static inline unsigned long xchg_u32(volatile void *ptr,
return val;
}
-static inline u32 cmpxchg_small(volatile void *ptr, u32 old, u32 new,
- int size)
-{
- int off = (unsigned long)ptr % sizeof(u32);
- volatile u32 *p = ptr - off;
-#ifdef __BIG_ENDIAN
- int bitoff = (sizeof(u32) - size - off) * BITS_PER_BYTE;
-#else
- int bitoff = off * BITS_PER_BYTE;
-#endif
- u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff;
- u32 load32, old32, new32;
- u32 ret;
-
- load32 = READ_ONCE(*p);
-
- while (true) {
- ret = (load32 & bitmask) >> bitoff;
- if (old != ret)
- return ret;
-
- old32 = (load32 & ~bitmask) | (old << bitoff);
- new32 = (load32 & ~bitmask) | (new << bitoff);
-
- /* Do 32 bit cmpxchg */
- load32 = cmpxchg_u32(p, old32, new32);
- if (load32 == old32)
- return old;
- }
-}
-
-/* xchg */
-
-static inline u32 xchg_small(volatile void *ptr, u32 x, int size)
-{
- int off = (unsigned long)ptr % sizeof(u32);
- volatile u32 *p = ptr - off;
-#ifdef __BIG_ENDIAN
- int bitoff = (sizeof(u32) - size - off) * BITS_PER_BYTE;
-#else
- int bitoff = off * BITS_PER_BYTE;
-#endif
- u32 bitmask = ((0x1 << size * BITS_PER_BYTE) - 1) << bitoff;
- u32 oldv, newv;
- u32 ret;
-
- do {
- oldv = READ_ONCE(*p);
- ret = (oldv & bitmask) >> bitoff;
- newv = (oldv & ~bitmask) | (x << bitoff);
- } while (cmpxchg_u32(p, oldv, newv) != oldv);
-
- return ret;
-}
-
-/*
- * This function doesn't exist, so you'll get a linker error
- * if something tries to do an invalid cmpxchg().
- */
-extern unsigned long __cmpxchg_called_with_bad_pointer(void)
- __compiletime_error("Bad argument size for cmpxchg");
-
-static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
- unsigned long new, int size)
-{
- switch (size) {
- case 1:
- case 2:
- return cmpxchg_small(ptr, old, new, size);
- case 4:
- return cmpxchg_u32(ptr, old, new);
- default:
- return __cmpxchg_called_with_bad_pointer();
- }
-}
-
-#define arch_cmpxchg(ptr, o, n) \
- ({ \
- (__typeof__(*(ptr))) __cmpxchg((ptr), \
- (unsigned long)(o), \
- (unsigned long)(n), \
- sizeof(*(ptr))); \
- })
-
-/*
- * This function doesn't exist, so you'll get a linker error if
- * something tries to do an invalidly-sized xchg().
- */
-extern unsigned long __xchg_called_with_bad_pointer(void)
- __compiletime_error("Bad argument size for xchg");
-
-static inline unsigned long __xchg(volatile void *ptr, unsigned long with,
- int size)
-{
- switch (size) {
- case 1:
- case 2:
- return xchg_small(ptr, with, size);
- case 4:
- return xchg_u32(ptr, with);
- default:
- return __xchg_called_with_bad_pointer();
- }
-}
-
-#define arch_xchg(ptr, with) \
- ({ \
- (__typeof__(*(ptr))) __xchg((ptr), \
- (unsigned long)(with), \
- sizeof(*(ptr))); \
- })
+#define __xchg(ptr, new, size) \
+({ \
+ __typeof__(ptr) __ptr = (ptr); \
+ __typeof__(new) __new = (new); \
+ __typeof__(*(ptr)) __ret; \
+ switch (size) { \
+ case 4: \
+ __ret = (__typeof__(*(ptr))) \
+ xchg32(__ptr, (u32)__new); \
+ break; \
+ default: \
+ BUILD_BUG(); \
+ } \
+ __ret; \
+})
+
+#define arch_xchg(ptr, x) \
+({ \
+ __typeof__(*(ptr)) _x_ = (x); \
+ (__typeof__(*(ptr))) __xchg((ptr), _x_, sizeof(*(ptr))); \
+})
#endif /* __ASM_OPENRISC_CMPXCHG_H */
--
2.36.1
* [PATCH V9 14/15] openrisc: Move from ticket-lock to qspinlock
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (12 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 13/15] openrisc: cmpxchg: Cleanup unnecessary codes guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:13 ` [PATCH V9 15/15] csky: spinlock: Use the generic header files guoren
2022-08-08 7:25 ` [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup Guo Ren
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren,
Guo Ren, Jonas Bonn, Stefan Kristiansson
From: Guo Ren <guoren@linux.alibaba.com>
Enable qspinlock per the requirements mentioned in commit a8ad07e5240c9
("asm-generic: qspinlock: Indicate the use of mixed-size atomics").
Openrisc only has "l.lwa/l.swa" for all atomic operations. That means
its ll/sc pair should be a strong atomic forward progress guarantee, or
all atomic operations may cause live lock. The ticket-lock needs
atomic_fetch_add well defined forward progress guarantees under
contention, and qspinlock needs xchg16 forward progress guarantees. The
atomic_fetch_add (l.lwa + add + l.swa) & xchg16 (l.lwa + and + or +
l.swa) have similar implementations, so they has the same forward
progress guarantees.
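For reference, a minimal C-level sketch of the sub-word emulation that
the l.lwa/l.swa sequence in the hunk below performs, expressed on top of
the cmpxchg32() already in this file. The name xchg16_sketch is
hypothetical, and the shift computation follows the big-endian byte
numbering used by the cmpxchg_small() code removed in the previous
patch; this is an illustration, not the patch's implementation.

static inline u16 xchg16_sketch(volatile u16 *ptr, u16 val)
{
	u32 *p = (u32 *)((unsigned long)ptr & ~0x3UL);
	int off = (unsigned long)ptr & 0x2;
	/*
	 * openrisc is big-endian: for size == 2 the removed
	 * cmpxchg_small() computed bitoff as (4 - 2 - off) * 8.
	 */
	int shift = (2 - off) * BITS_PER_BYTE;
	u32 mask = 0xffffU << shift;
	u32 old, new;

	do {
		old = READ_ONCE(*p);
		new = (old & ~mask) | ((u32)val << shift);
	} while (cmpxchg32(p, old, new) != old);

	return (u16)((old & mask) >> shift);
}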
The qspinlock is smaller and faster than the ticket-lock when
everything stays on the fast path, so there is no reason to keep
openrisc on the ticket-lock instead of qspinlock. Here is a comparison
of the qspinlock and ticket-lock fast-path code sizes (bytes):
TYPE : TICKET | QUEUED
arch_spin_lock : 128 | 96
arch_spin_unlock : 56 | 44
arch_spin_trylock : 108 | 80
arch_spin_is_locked : 36 | 36
arch_spin_is_contended : 36 | 36
arch_spin_value_unlocked: 28 | 28
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
---
arch/openrisc/Kconfig | 1 +
arch/openrisc/include/asm/Kbuild | 2 ++
arch/openrisc/include/asm/cmpxchg.h | 25 +++++++++++++++++++++++++
3 files changed, 28 insertions(+)
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index c7f282f60f64..1652a6aac882 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -10,6 +10,7 @@ config OPENRISC
select ARCH_HAS_DMA_SET_UNCACHED
select ARCH_HAS_DMA_CLEAR_UNCACHED
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
+ select ARCH_USE_QUEUED_SPINLOCKS
select COMMON_CLK
select OF
select OF_EARLY_FLATTREE
diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
index c8c99b554ca4..ad147fec50b4 100644
--- a/arch/openrisc/include/asm/Kbuild
+++ b/arch/openrisc/include/asm/Kbuild
@@ -2,6 +2,8 @@
generic-y += extable.h
generic-y += kvm_para.h
generic-y += parport.h
+generic-y += mcs_spinlock.h
+generic-y += qspinlock.h
generic-y += spinlock_types.h
generic-y += spinlock.h
generic-y += qrwlock_types.h
diff --git a/arch/openrisc/include/asm/cmpxchg.h b/arch/openrisc/include/asm/cmpxchg.h
index df83b33b5882..2d650b07a0f4 100644
--- a/arch/openrisc/include/asm/cmpxchg.h
+++ b/arch/openrisc/include/asm/cmpxchg.h
@@ -65,6 +65,27 @@ static inline u32 cmpxchg32(volatile void *ptr, u32 old, u32 new)
})
/* xchg */
+static inline u32 xchg16(volatile void *ptr, u32 val)
+{
+ u32 ret, tmp;
+ u32 shift = ((ulong)ptr & 2) ? 16 : 0;
+ u32 mask = 0xffff << shift;
+ u32 *__ptr = (u32 *)((ulong)ptr & ~2);
+
+ __asm__ __volatile__(
+ "1: l.lwa %0, 0(%2) \n"
+ " l.and %1, %0, %3 \n"
+ " l.or %1, %1, %4 \n"
+ " l.swa 0(%2), %1 \n"
+ " l.bnf 1b \n"
+ " l.nop \n"
+ : "=&r" (ret), "=&r" (tmp)
+ : "r"(__ptr), "r" (~mask), "r" (val << shift)
+ : "cc", "memory");
+
+ return (ret & mask) >> shift;
+}
+
static inline u32 xchg32(volatile void *ptr, u32 val)
{
__asm__ __volatile__(
@@ -85,6 +106,10 @@ static inline u32 xchg32(volatile void *ptr, u32 val)
__typeof__(new) __new = (new); \
__typeof__(*(ptr)) __ret; \
switch (size) { \
+ case 2: \
+ __ret = (__typeof__(*(ptr))) \
+ xchg16(__ptr, (u32)__new); \
+ break; \
case 4: \
__ret = (__typeof__(*(ptr))) \
xchg32(__ptr, (u32)__new); \
--
2.36.1
* [PATCH V9 15/15] csky: spinlock: Use the generic header files
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (13 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 14/15] openrisc: Move from ticket-lock to qspinlock guoren
@ 2022-08-08 7:13 ` guoren
2022-08-08 7:25 ` [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup Guo Ren
15 siblings, 0 replies; 17+ messages in thread
From: guoren @ 2022-08-08 7:13 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren, Guo Ren
From: Guo Ren <guoren@linux.alibaba.com>
There is no difference between the csky and the generic implementation,
so use the generic headers.
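For context, assuming the standard kbuild generic-y mechanism (nothing
here is added by this patch): each generic-y entry makes the build
generate a one-line wrapper, so the resulting
arch/csky/include/generated/asm/spinlock.h is just:

#include <asm-generic/spinlock.h>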
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
arch/csky/include/asm/Kbuild | 2 ++
arch/csky/include/asm/spinlock.h | 12 ------------
arch/csky/include/asm/spinlock_types.h | 9 ---------
3 files changed, 2 insertions(+), 21 deletions(-)
delete mode 100644 arch/csky/include/asm/spinlock.h
delete mode 100644 arch/csky/include/asm/spinlock_types.h
diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index 1117c28cb7e8..c08050fc0cce 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -7,6 +7,8 @@ generic-y += mcs_spinlock.h
generic-y += qrwlock.h
generic-y += qrwlock_types.h
generic-y += qspinlock.h
+generic-y += spinlock_types.h
+generic-y += spinlock.h
generic-y += parport.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/csky/include/asm/spinlock.h b/arch/csky/include/asm/spinlock.h
deleted file mode 100644
index 83a2005341f5..000000000000
--- a/arch/csky/include/asm/spinlock.h
+++ /dev/null
@@ -1,12 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#ifndef __ASM_CSKY_SPINLOCK_H
-#define __ASM_CSKY_SPINLOCK_H
-
-#include <asm/qspinlock.h>
-#include <asm/qrwlock.h>
-
-/* See include/linux/spinlock.h */
-#define smp_mb__after_spinlock() smp_mb()
-
-#endif /* __ASM_CSKY_SPINLOCK_H */
diff --git a/arch/csky/include/asm/spinlock_types.h b/arch/csky/include/asm/spinlock_types.h
deleted file mode 100644
index 75bdf3af80ba..000000000000
--- a/arch/csky/include/asm/spinlock_types.h
+++ /dev/null
@@ -1,9 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#ifndef __ASM_CSKY_SPINLOCK_TYPES_H
-#define __ASM_CSKY_SPINLOCK_TYPES_H
-
-#include <asm-generic/qspinlock_types.h>
-#include <asm-generic/qrwlock_types.h>
-
-#endif /* __ASM_CSKY_SPINLOCK_TYPES_H */
--
2.36.1
* Re: [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup
2022-08-08 7:13 [PATCH V9 00/15] arch: Add qspinlock support and atomic cleanup guoren
` (14 preceding siblings ...)
2022-08-08 7:13 ` [PATCH V9 15/15] csky: spinlock: Use the generic header files guoren
@ 2022-08-08 7:25 ` Guo Ren
15 siblings, 0 replies; 17+ messages in thread
From: Guo Ren @ 2022-08-08 7:25 UTC (permalink / raw)
To: palmer, heiko, hch, arnd, peterz, will, boqun.feng, longman,
shorne, conor.dooley
Cc: linux-csky, linux-arch, linux-kernel, linux-riscv, Guo Ren
Sorry, here is the Changelog:
Changes in V9:
- Fixup xchg16 compile warning
- Keep ticket-lock the same semantic with qspinlock
- Remove unused xchg32 and xchg64
- Forbid arch_cmpxchg64 for 32-bit
- Add openrisc qspinlock support
Changes in V8:
- Ticket-lock coding convention fixups
- Move combo spinlock into riscv and simplify asm-generic/spinlock.h
- Fix xchg16's wrong return value
- Add csky qspinlock
- Add combo & qspinlock & ticket-lock comparison
- Clean up unnecessary riscv acquire and release definitions
- Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
Changes in V7:
- Add combo spinlock (ticket & queued) support
- Rename ticket_spinlock.h
- Remove unnecessary atomic_read in ticket_spin_value_unlocked
Changes in V6:
- Fix a Clang compile problem reported by kernel test robot <lkp@intel.com>
- Cleanup asm-generic/spinlock.h
- Remove changelog in patch main comment part, suggested by
Conor.Dooley@microchip.com
- Remove "default y if NUMA" in Kconfig
Changes in V5:
- Update comment with RISC-V forward guarantee feature.
- Back to V3 direction and optimize asm code.
Changes in V4:
- Remove custom sub-word xchg implementation
- Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
Changes in V3:
- Coding convention fixes per Peter Zijlstra's advice
Changes in V2:
- Coding convention in cmpxchg.h
- Re-implement short xchg
- Remove char & cmpxchg implementations
On Mon, Aug 8, 2022 at 3:14 PM <guoren@kernel.org> wrote:
> [snip]
--
Best Regards
Guo Ren