* [PATCH V7 0/5] riscv: Add qspinlock support with combo style
@ 2022-06-28  8:17 ` guoren
  0 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Enable qspinlock and meet the requirements mentioned in commit a8ad07e5240c9
("asm-generic: qspinlock: Indicate the use of mixed-size atomics").

RISC-V LR/SC pairs can provide either a strong or a weak forward-progress
guarantee, depending on the micro-architecture. The RISC-V ISA spec lays
out several constraints that let hardware provide a strict forward
guarantee (RISC-V User ISA - 8.3 Eventual Success of Store-Conditional
Instructions):
We restricted the length of LR/SC loops to fit within 64 contiguous
instruction bytes in the base ISA to avoid undue restrictions on
instruction cache and TLB size and associativity. Similarly, we
disallowed other loads and stores within the loops to avoid restrictions
on data-cache associativity in simple implementations that track the
reservation within a private cache. The restrictions on branches and
jumps limit the time that can be spent in the sequence. Floating-point
operations and integer multiply/divide were disallowed to simplify the
operating system’s emulation of these instructions on implementations
lacking appropriate hardware support.
Software is not forbidden from using unconstrained LR/SC sequences, but
portable software must detect the case that the sequence repeatedly
fails, then fall back to an alternate code sequence that does not rely
on an unconstrained LR/SC sequence. Implementations are permitted to
unconditionally fail any unconstrained LR/SC sequence.
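
For illustration (not part of this series), a constrained sequence that
obeys these rules is the classic LR/SC compare-and-swap loop, essentially
what arch/riscv/include/asm/cmpxchg.h already emits for 32-bit values. A
minimal sketch (the function name is made up here):

	/*
	 * Sketch of a constrained LR/SC sequence: no other loads/stores,
	 * no FP or mul/div, and the only backward branch is the SC retry,
	 * so it fits the 64-byte constrained-loop rules quoted above.
	 */
	static inline u32 cmpxchg32_sketch(u32 *p, u32 oldval, u32 newval)
	{
		u32 prev, rc;

		__asm__ __volatile__ (
			"0:	lr.w	%0, %2\n"	/* load-reserved */
			"	bne	%0, %z3, 1f\n"	/* bail out on mismatch */
			"	sc.w	%1, %z4, %2\n"	/* store-conditional */
			"	bnez	%1, 0b\n"	/* retry if SC failed */
			"1:\n"
			: "=&r" (prev), "=&r" (rc), "+A" (*p)
			: "rJ" (oldval), "rJ" (newval)
			: "memory");

		return prev;
	}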

For example, some riscv hardware such as BOOMv3 & XiangShan can provide a
strict & strong forward guarantee (the cache line is kept in an exclusive
state for a number of back-off cycles, and only an interrupt on the same
core can break the LR/SC pair). QEMU's riscv target currently gives only a
weak forward guarantee because of an incorrect implementation [1].

Add combo spinlock (ticket & queued) support
Some architectures have flexible requirements on the spinlock type. Some
LL/SC ISAs don't force the micro-architecture to provide a strong forward
guarantee, so micro-architectures with different memory models can
implement the same ISA. The ticket lock suits exclusive-monitor style
LL/SC micro-architectures with a limited number of cores and no NUMA. The
queued spinlock can handle NUMA/large-scale scenarios on LL/SC
micro-architectures designed with a strong forward guarantee.

The first version of this patch series was posted in January 2019 [2].

[1] https://github.com/qemu/qemu/blob/master/target/riscv/insn_trans/trans_rva.c.inc
[2] https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/#r

Change V7:
 - Add combo spinlock (ticket & queued) support
 - Rename ticket_spinlock.h
 - Remove unnecessary atomic_read in ticket_spin_value_unlocked  

Change V6:
 - Fixup Clang compile problem Reported-by: kernel test robot
   <lkp@intel.com>
 - Cleanup asm-generic/spinlock.h
 - Remove changelog in patch main comment part, suggested by
   Conor.Dooley@microchip.com
 - Remove "default y if NUMA" in Kconfig

Change V5:
 - Update comment with RISC-V forward guarantee feature.
 - Back to V3 direction and optimize asm code.

Change V4:
 - Remove custom sub-word xchg implementation
 - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock

Change V3:
 - Coding convention by Peter Zijlstra's advices

Change V2:
 - Coding convention in cmpxchg.h
 - Re-implement short xchg
 - Remove char & cmpxchg implementations

Guo Ren (5):
  asm-generic: ticket-lock: Remove unnecessary atomic_read
  asm-generic: ticket-lock: Use the same struct definitions with qspinlock
  asm-generic: ticket-lock: Move into ticket_spinlock.h
  asm-generic: spinlock: Add combo spinlock (ticket & queued)
  riscv: Add qspinlock support

 arch/riscv/Kconfig                    |  9 +++
 arch/riscv/include/asm/Kbuild         |  2 +
 arch/riscv/include/asm/cmpxchg.h      | 17 +++++
 arch/riscv/kernel/setup.c             |  4 ++
 include/asm-generic/spinlock.h        | 81 +++++++++++++----------
 include/asm-generic/spinlock_types.h  | 12 +---
 include/asm-generic/ticket_spinlock.h | 92 +++++++++++++++++++++++++++
 kernel/locking/qspinlock.c            |  2 +
 8 files changed, 174 insertions(+), 45 deletions(-)
 create mode 100644 include/asm-generic/ticket_spinlock.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-28  8:17 ` guoren
@ 2022-06-28  8:17   ` guoren
  -1 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Guo Ren, Peter Zijlstra

From: Guo Ren <guoren@linux.alibaba.com>

Remove the unnecessary atomic_read() in arch_spin_value_unlocked(lock); the
value is already available in 'lock', which is passed by value. This also
keeps arch_spin_value_unlocked() from contending on the spinlock data again.
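
For context, the current in-tree user (lib/lockref.c) already operates on a
by-value snapshot of the lock word. A simplified sketch of that fast path
(the function name is made up; this is not the exact kernel code):

	/*
	 * Sketch modelled on lib/lockref.c: the lockref is snapshotted into
	 * a local copy, so arch_spin_value_unlocked() only ever inspects
	 * that copy and a plain read of the counter is enough -- no
	 * atomic_read() of the shared lock word is required.
	 */
	static bool lockref_get_fastpath(struct lockref *lockref)
	{
		struct lockref old;

		old.lock_count = READ_ONCE(lockref->lock_count);
		while (arch_spin_value_unlocked(old.lock.rlock.raw_lock)) {
			struct lockref new = old;

			new.count++;
			if (cmpxchg64_relaxed(&lockref->lock_count,
					      old.lock_count,
					      new.lock_count) == old.lock_count)
				return true;	/* updated without the lock */
			old.lock_count = READ_ONCE(lockref->lock_count);
		}
		return false;			/* fall back to the spinlock */
	}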

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 include/asm-generic/spinlock.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index fdfebcb050f4..f1e4fa100f5a 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
-	return !arch_spin_is_locked(&lock);
+	u32 val = lock.counter;
+
+	return ((val >> 16) == (val & 0xffff));
 }
 
 #include <asm/qrwlock.h>
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V7 2/5] asm-generic: ticket-lock: Use the same struct definitions with qspinlock
  2022-06-28  8:17 ` guoren
@ 2022-06-28  8:17   ` guoren
  -1 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Guo Ren, Peter Zijlstra

From: Guo Ren <guoren@linux.alibaba.com>

Make the ticket lock use the same struct definitions as qspinlock, so that
we can later move to a combo spinlock (combining ticket & queued).
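
For reference, the shared layout this moves to is the qspinlock one; a
simplified, little-endian-only view of include/asm-generic/qspinlock_types.h
is shown below. The ticket code then uses the low 16 bits of 'val' as the
"now serving" half and the high 16 bits as the next ticket, while qspinlock
sees locked/pending bytes plus a 16-bit MCS tail:

	typedef struct qspinlock {
		union {
			atomic_t val;		/* whole 32-bit lock word */
			struct {		/* little-endian layout */
				u8	locked;
				u8	pending;
			};
			struct {
				u16	locked_pending;
				u16	tail;
			};
		};
	} arch_spinlock_t;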

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 include/asm-generic/spinlock.h       | 16 ++++++++--------
 include/asm-generic/spinlock_types.h | 12 ++----------
 2 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index f1e4fa100f5a..4caeb8cebe53 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -32,7 +32,7 @@
 
 static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 {
-	u32 val = atomic_fetch_add(1<<16, lock);
+	u32 val = atomic_fetch_add(1<<16, &lock->val);
 	u16 ticket = val >> 16;
 
 	if (ticket == (u16)val)
@@ -46,45 +46,45 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 	 * have no outstanding writes due to the atomic_fetch_add() the extra
 	 * orderings are free.
 	 */
-	atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
 	smp_mb();
 }
 
 static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
 {
-	u32 old = atomic_read(lock);
+	u32 old = atomic_read(&lock->val);
 
 	if ((old >> 16) != (old & 0xffff))
 		return false;
 
-	return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
 }
 
 static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
 	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
-	u32 val = atomic_read(lock);
+	u32 val = atomic_read(&lock->val);
 
 	smp_store_release(ptr, (u16)val + 1);
 }
 
 static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	u32 val = atomic_read(lock);
+	u32 val = atomic_read(&lock->val);
 
 	return ((val >> 16) != (val & 0xffff));
 }
 
 static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	u32 val = atomic_read(lock);
+	u32 val = atomic_read(&lock->val);
 
 	return (s16)((val >> 16) - (val & 0xffff)) > 1;
 }
 
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
-	u32 val = lock.counter;
+	u32 val = lock.val.counter;
 
 	return ((val >> 16) == (val & 0xffff));
 }
diff --git a/include/asm-generic/spinlock_types.h b/include/asm-generic/spinlock_types.h
index 8962bb730945..f534aa5de394 100644
--- a/include/asm-generic/spinlock_types.h
+++ b/include/asm-generic/spinlock_types.h
@@ -3,15 +3,7 @@
 #ifndef __ASM_GENERIC_SPINLOCK_TYPES_H
 #define __ASM_GENERIC_SPINLOCK_TYPES_H
 
-#include <linux/types.h>
-typedef atomic_t arch_spinlock_t;
-
-/*
- * qrwlock_types depends on arch_spinlock_t, so we must typedef that before the
- * include.
- */
-#include <asm/qrwlock_types.h>
-
-#define __ARCH_SPIN_LOCK_UNLOCKED	ATOMIC_INIT(0)
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>
 
 #endif /* __ASM_GENERIC_SPINLOCK_TYPES_H */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V7 3/5] asm-generic: ticket-lock: Move into ticket_spinlock.h
  2022-06-28  8:17 ` guoren
@ 2022-06-28  8:17   ` guoren
  -1 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Guo Ren, Peter Zijlstra

From: Guo Ren <guoren@linux.alibaba.com>

Move the ticket-lock implementation into an independent header file. This is
a preparation patch for the combo spinlock that follows.
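
As a purely hypothetical sketch of what the split enables, an architecture
that only wants ticket locks could include the new header directly and wrap
it itself (nothing in this series does so; riscv keeps using the generic
spinlock.h via its Kbuild, see patch 5/5):

	/* arch/<arch>/include/asm/spinlock.h -- hypothetical sketch */
	#include <asm-generic/ticket_spinlock.h>

	#define arch_spin_lock(l)		ticket_spin_lock(l)
	#define arch_spin_trylock(l)		ticket_spin_trylock(l)
	#define arch_spin_unlock(l)		ticket_spin_unlock(l)
	#define arch_spin_is_locked(l)		ticket_spin_is_locked(l)
	#define arch_spin_is_contended(l)	ticket_spin_is_contended(l)
	#define arch_spin_value_unlocked(l)	ticket_spin_value_unlocked(l)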

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 include/asm-generic/spinlock.h        | 44 ++-----------
 include/asm-generic/ticket_spinlock.h | 92 +++++++++++++++++++++++++++
 2 files changed, 99 insertions(+), 37 deletions(-)
 create mode 100644 include/asm-generic/ticket_spinlock.h

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 4caeb8cebe53..f41dc7c2b900 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -27,66 +27,36 @@
 #ifndef __ASM_GENERIC_SPINLOCK_H
 #define __ASM_GENERIC_SPINLOCK_H
 
-#include <linux/atomic.h>
-#include <asm-generic/spinlock_types.h>
+#include <asm-generic/ticket_spinlock.h>
 
 static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 {
-	u32 val = atomic_fetch_add(1<<16, &lock->val);
-	u16 ticket = val >> 16;
-
-	if (ticket == (u16)val)
-		return;
-
-	/*
-	 * atomic_cond_read_acquire() is RCpc, but rather than defining a
-	 * custom cond_read_rcsc() here we just emit a full fence.  We only
-	 * need the prior reads before subsequent writes ordering from
-	 * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
-	 * have no outstanding writes due to the atomic_fetch_add() the extra
-	 * orderings are free.
-	 */
-	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
-	smp_mb();
+	ticket_spin_lock(lock);
 }
 
 static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
 {
-	u32 old = atomic_read(&lock->val);
-
-	if ((old >> 16) != (old & 0xffff))
-		return false;
-
-	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
+	return ticket_spin_trylock(lock);
 }
 
 static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
-	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
-	u32 val = atomic_read(&lock->val);
-
-	smp_store_release(ptr, (u16)val + 1);
+	ticket_spin_unlock(lock);
 }
 
 static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	u32 val = atomic_read(&lock->val);
-
-	return ((val >> 16) != (val & 0xffff));
+	return ticket_spin_is_locked(lock);
 }
 
 static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	u32 val = atomic_read(&lock->val);
-
-	return (s16)((val >> 16) - (val & 0xffff)) > 1;
+	return ticket_spin_is_contended(lock);
 }
 
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
-	u32 val = lock.val.counter;
-
-	return ((val >> 16) == (val & 0xffff));
+	return ticket_spin_value_unlocked(lock);
 }
 
 #include <asm/qrwlock.h>
diff --git a/include/asm-generic/ticket_spinlock.h b/include/asm-generic/ticket_spinlock.h
new file mode 100644
index 000000000000..83e769398eea
--- /dev/null
+++ b/include/asm-generic/ticket_spinlock.h
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well defined forward progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC although
+ * you'd be hard pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this you might be better off with
+ * a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
+ * a full fence after the spin to upgrade the otherwise-RCpc
+ * atomic_cond_read_acquire().
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE like instructions to sleep instead of poll for word
+ * modifications be sure to implement that (see ARM64 for example).
+ *
+ */
+
+#ifndef __ASM_GENERIC_TICKET_SPINLOCK_H
+#define __ASM_GENERIC_TICKET_SPINLOCK_H
+
+#include <linux/atomic.h>
+#include <asm-generic/spinlock_types.h>
+
+static __always_inline void ticket_spin_lock(arch_spinlock_t *lock)
+{
+	u32 val = atomic_fetch_add(1<<16, &lock->val);
+	u16 ticket = val >> 16;
+
+	if (ticket == (u16)val)
+		return;
+
+	/*
+	 * atomic_cond_read_acquire() is RCpc, but rather than defining a
+	 * custom cond_read_rcsc() here we just emit a full fence.  We only
+	 * need the prior reads before subsequent writes ordering from
+	 * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
+	 * have no outstanding writes due to the atomic_fetch_add() the extra
+	 * orderings are free.
+	 */
+	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
+	smp_mb();
+}
+
+static __always_inline bool ticket_spin_trylock(arch_spinlock_t *lock)
+{
+	u32 old = atomic_read(&lock->val);
+
+	if ((old >> 16) != (old & 0xffff))
+		return false;
+
+	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void ticket_spin_unlock(arch_spinlock_t *lock)
+{
+	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+	u32 val = atomic_read(&lock->val);
+
+	smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int ticket_spin_is_locked(arch_spinlock_t *lock)
+{
+	u32 val = atomic_read(&lock->val);
+
+	return ((val >> 16) != (val & 0xffff));
+}
+
+static __always_inline int ticket_spin_is_contended(arch_spinlock_t *lock)
+{
+	u32 val = atomic_read(&lock->val);
+
+	return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+static __always_inline int ticket_spin_value_unlocked(arch_spinlock_t lock)
+{
+	u32 val = lock.val.counter;
+
+	return ((val >> 16) == (val & 0xffff));
+}
+
+#endif /* __ASM_GENERIC_TICKET_SPINLOCK_H */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-28  8:17 ` guoren
@ 2022-06-28  8:17   ` guoren
  -1 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Guo Ren, Peter Zijlstra

From: Guo Ren <guoren@linux.alibaba.com>

Some architectures have flexible requirements on the spinlock type. Some
LL/SC ISAs don't force the micro-architecture to provide a strong forward
guarantee, so micro-architectures with different memory models can
implement the same ISA. The ticket lock suits exclusive-monitor style
LL/SC micro-architectures with a limited number of cores and no NUMA. The
queued spinlock can handle NUMA/large-scale scenarios on LL/SC
micro-architectures designed with a strong forward guarantee.

So, make the spinlock a combo whose implementation is selected at boot time
via a static key.
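
A minimal sketch of how an architecture could flip the combo at boot, where
cpu_has_strong_lrsc() is a made-up helper standing in for whatever detection
the platform has (the actual riscv wiring in patch 5/5 keys off CONFIG_NUMA
instead):

	/*
	 * Sketch only: called early in setup_arch(), before other CPUs can
	 * spin on any lock, so the static key is stable from then on.
	 */
	void __init arch_select_spinlock(void)
	{
		if (!cpu_has_strong_lrsc())	/* hypothetical check */
			static_branch_disable(&use_qspinlock_key);
	}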

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
 kernel/locking/qspinlock.c     |  2 ++
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index f41dc7c2b900..a9b43089bf99 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -28,34 +28,73 @@
 #define __ASM_GENERIC_SPINLOCK_H
 
 #include <asm-generic/ticket_spinlock.h>
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+#include <linux/jump_label.h>
+#include <asm-generic/qspinlock.h>
+
+DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
+#endif
+
+#undef arch_spin_is_locked
+#undef arch_spin_is_contended
+#undef arch_spin_value_unlocked
+#undef arch_spin_lock
+#undef arch_spin_trylock
+#undef arch_spin_unlock
 
 static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 {
-	ticket_spin_lock(lock);
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		queued_spin_lock(lock);
+	else
+#endif
+		ticket_spin_lock(lock);
 }
 
 static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
 {
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		return queued_spin_trylock(lock);
+#endif
 	return ticket_spin_trylock(lock);
 }
 
 static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
-	ticket_spin_unlock(lock);
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		queued_spin_unlock(lock);
+	else
+#endif
+		ticket_spin_unlock(lock);
 }
 
 static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		return queued_spin_is_locked(lock);
+#endif
 	return ticket_spin_is_locked(lock);
 }
 
 static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		return queued_spin_is_contended(lock);
+#endif
 	return ticket_spin_is_contended(lock);
 }
 
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
+#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
+	if (static_branch_likely(&use_qspinlock_key))
+		return queued_spin_value_unlocked(lock);
+#endif
 	return ticket_spin_value_unlocked(lock);
 }
 
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 65a9a10caa6f..b7f7436f42f6 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -566,6 +566,8 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 }
 EXPORT_SYMBOL(queued_spin_lock_slowpath);
 
+DEFINE_STATIC_KEY_TRUE_RO(use_qspinlock_key);
+
 /*
  * Generate the paravirt code for queued_spin_unlock_slowpath().
  */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH V7 5/5] riscv: Add qspinlock support
  2022-06-28  8:17 ` guoren
@ 2022-06-28  8:17   ` guoren
  -1 siblings, 0 replies; 48+ messages in thread
From: guoren @ 2022-06-28  8:17 UTC (permalink / raw)
  To: palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Guo Ren, Peter Zijlstra

From: Guo Ren <guoren@linux.alibaba.com>

Enable qspinlock according to the requirements mentioned in commit
a8ad07e5240c9 ("asm-generic: qspinlock: Indicate the use of mixed-size
atomics").

 - RISC-V atomic_*_release()/atomic_*_acquire() are implemented with their
   relaxed versions plus an acquire/release fence for RCsc synchronization.

 - RISC-V LR/SC pairs can provide a strong or weak forward guarantee,
   depending on the micro-architecture. The RISC-V ISA spec lays out
   several constraints that let hardware provide a strict forward
   guarantee (RISC-V User ISA - 8.3 Eventual Success of Store-Conditional
   Instructions). Some riscv cores such as BOOMv3 & XiangShan can provide
   a strict & strong forward guarantee (the cache line is kept in an
   exclusive state for a number of back-off cycles, and only an interrupt
   on the same core can break the LR/SC pair).

 - RISC-V provides a cheap atomic_fetch_or_acquire() with RCsc semantics.

 - RISC-V provides only a relaxed xchg16 to support qspinlock.
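
For context, the reason a sub-word xchg is needed at all is that qspinlock's
slow path exchanges only the 16-bit tail field; this is the existing helper
in kernel/locking/qspinlock.c (the _Q_PENDING_BITS == 8 case, comments
trimmed), reproduced here only for reference:

	static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
	{
		/* only the 16-bit tail is exchanged, hence the need for xchg16 */
		return (u32)xchg_relaxed(&lock->tail,
					 tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
	}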

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
---
 arch/riscv/Kconfig               |  9 +++++++++
 arch/riscv/include/asm/Kbuild    |  2 ++
 arch/riscv/include/asm/cmpxchg.h | 17 +++++++++++++++++
 arch/riscv/kernel/setup.c        |  4 ++++
 4 files changed, 32 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 32ffef9f6e5b..47e12ab9c822 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -333,6 +333,15 @@ config NODES_SHIFT
 	  Specify the maximum number of NUMA Nodes available on the target
 	  system.  Increases memory reserved to accommodate various tables.
 
+config RISCV_USE_QUEUED_SPINLOCKS
+	bool "Using queued spinlock instead of ticket-lock"
+	depends on SMP && MMU
+	select ARCH_USE_QUEUED_SPINLOCKS
+	default y
+	help
+	  Make sure your micro arch LL/SC has a strong forward progress guarantee.
+	  Otherwise, stay at ticket-lock.
+
 config RISCV_ALTERNATIVE
 	bool
 	depends on !XIP_KERNEL
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 504f8b7e72d4..2cce98c7b653 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -2,7 +2,9 @@
 generic-y += early_ioremap.h
 generic-y += flat.h
 generic-y += kvm_para.h
+generic-y += mcs_spinlock.h
 generic-y += parport.h
+generic-y += qspinlock.h
 generic-y += spinlock.h
 generic-y += spinlock_types.h
 generic-y += qrwlock.h
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 12debce235e5..492104d45a23 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -17,6 +17,23 @@
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 2: { 							\
+		u32 temp;						\
+		u32 shif = ((ulong)__ptr & 2) ? 16 : 0;			\
+		u32 mask = 0xffff << shif;				\
+		__ptr = (__typeof__(ptr))((ulong)__ptr & ~(ulong)2);	\
+		__asm__ __volatile__ (					\
+			"0:	lr.w %0, %2\n"				\
+			"	and  %1, %0, %z3\n"			\
+			"	or   %1, %1, %z4\n"			\
+			"	sc.w %1, %1, %2\n"			\
+			"	bnez %1, 0b\n"				\
+			: "=&r" (__ret), "=&r" (temp), "+A" (*__ptr)	\
+			: "rJ" (~mask), "rJ" (__new << shif)		\
+			: "memory");					\
+		__ret = (__ret & mask) >> shif;				\
+		break;							\
+	}								\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w %0, %2, %1\n"			\
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index f0f36a4a0e9b..b9b234157a66 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -295,6 +295,10 @@ void __init setup_arch(char **cmdline_p)
 	setup_smp();
 #endif
 
+#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
+	static_branch_disable(&use_qspinlock_key);
+#endif
+
 	riscv_fill_hwcap();
 	apply_boot_alternatives();
 }
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-28  8:17   ` guoren
@ 2022-06-28 18:05     ` Waiman Long
  -1 siblings, 0 replies; 48+ messages in thread
From: Waiman Long @ 2022-06-28 18:05 UTC (permalink / raw)
  To: guoren, palmer, arnd, mingo, will, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Peter Zijlstra

On 6/28/22 04:17, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> because the value has been in lock. This patch could prevent
> arch_spin_value_unlocked contend spin_lock data again.
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> ---
>   include/asm-generic/spinlock.h | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..f1e4fa100f5a 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>   
>   static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>   {
> -	return !arch_spin_is_locked(&lock);
> +	u32 val = lock.counter;
> +
> +	return ((val >> 16) == (val & 0xffff));
>   }
>   
>   #include <asm/qrwlock.h>

lockref.c is the only current user of arch_spin_value_unlocked(). This 
change is probably OK with this particular use case. Do you have any 
performance data about the improvement due to this change?

You may have to document that we have to revisit that if another use 
case shows up.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-28  8:17   ` guoren
@ 2022-06-28 18:13     ` Waiman Long
  -1 siblings, 0 replies; 48+ messages in thread
From: Waiman Long @ 2022-06-28 18:13 UTC (permalink / raw)
  To: guoren, palmer, arnd, mingo, will, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Peter Zijlstra

On 6/28/22 04:17, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> Some architecture has a flexible requirement on the type of spinlock.
> Some LL/SC architectures of ISA don't force micro-arch to give a strong
> forward guarantee. Thus different kinds of memory model micro-arch would
> come out in one ISA. The ticket lock is suitable for exclusive monitor
> designed LL/SC micro-arch with limited cores and "!NUMA". The
> queue-spinlock could deal with NUMA/large-scale scenarios with a strong
> forward guarantee designed LL/SC micro-arch.
>
> So, make the spinlock a combo with feature.
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> ---
>   include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
>   kernel/locking/qspinlock.c     |  2 ++
>   2 files changed, 43 insertions(+), 2 deletions(-)
>
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index f41dc7c2b900..a9b43089bf99 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -28,34 +28,73 @@
>   #define __ASM_GENERIC_SPINLOCK_H
>   
>   #include <asm-generic/ticket_spinlock.h>
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +#include <linux/jump_label.h>
> +#include <asm-generic/qspinlock.h>
> +
> +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
> +#endif
> +
> +#undef arch_spin_is_locked
> +#undef arch_spin_is_contended
> +#undef arch_spin_value_unlocked
> +#undef arch_spin_lock
> +#undef arch_spin_trylock
> +#undef arch_spin_unlock
>   
>   static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
>   {
> -	ticket_spin_lock(lock);
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		queued_spin_lock(lock);
> +	else
> +#endif
> +		ticket_spin_lock(lock);
>   }

Why do you use a static key to control whether to use qspinlock or 
ticket lock? In the next patch, you have

+#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
+	static_branch_disable(&use_qspinlock_key);
+#endif

So the current config setting determines whether qspinlock will be used, not 
some boot-time parameter that the user needs to specify. This patch will 
just add useless code to the lock/unlock sites. I don't see any benefit of 
doing that.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-28 18:13     ` Waiman Long
@ 2022-06-29  1:17       ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-06-29  1:17 UTC (permalink / raw)
  To: Waiman Long
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
>
> On 6/28/22 04:17, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Some architecture has a flexible requirement on the type of spinlock.
> > Some LL/SC architectures of ISA don't force micro-arch to give a strong
> > forward guarantee. Thus different kinds of memory model micro-arch would
> > come out in one ISA. The ticket lock is suitable for exclusive monitor
> > designed LL/SC micro-arch with limited cores and "!NUMA". The
> > queue-spinlock could deal with NUMA/large-scale scenarios with a strong
> > forward guarantee designed LL/SC micro-arch.
> >
> > So, make the spinlock a combo with feature.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Palmer Dabbelt <palmer@rivosinc.com>
> > ---
> >   include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
> >   kernel/locking/qspinlock.c     |  2 ++
> >   2 files changed, 43 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> > index f41dc7c2b900..a9b43089bf99 100644
> > --- a/include/asm-generic/spinlock.h
> > +++ b/include/asm-generic/spinlock.h
> > @@ -28,34 +28,73 @@
> >   #define __ASM_GENERIC_SPINLOCK_H
> >
> >   #include <asm-generic/ticket_spinlock.h>
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +#include <linux/jump_label.h>
> > +#include <asm-generic/qspinlock.h>
> > +
> > +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
> > +#endif
> > +
> > +#undef arch_spin_is_locked
> > +#undef arch_spin_is_contended
> > +#undef arch_spin_value_unlocked
> > +#undef arch_spin_lock
> > +#undef arch_spin_trylock
> > +#undef arch_spin_unlock
> >
> >   static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> >   {
> > -     ticket_spin_lock(lock);
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             queued_spin_lock(lock);
> > +     else
> > +#endif
> > +             ticket_spin_lock(lock);
> >   }
>
> Why do you use a static key to control whether to use qspinlock or
> ticket lock? In the next patch, you have
>
> +#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
> +       static_branch_disable(&use_qspinlock_key);
> +#endif
>
> So the current config setting determines if qspinlock will be used, not
> some boot time parameter that user needs to specify. This patch will
> just add useless code to lock/unlock sites. I don't see any benefit of
> doing that.
This is a startup patch for riscv. next, we could let vendors make choices.
I'm not sure they like cmdline or vendor-specific errata style.

Eventually, we would let one riscv Image support all machines, some
use ticket-lock, and some use qspinlock.

>
> Cheers,
> Longman
>



--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  1:17       ` Guo Ren
@ 2022-06-29  1:34         ` Waiman Long
  -1 siblings, 0 replies; 48+ messages in thread
From: Waiman Long @ 2022-06-29  1:34 UTC (permalink / raw)
  To: Guo Ren
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On 6/28/22 21:17, Guo Ren wrote:
> On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
>> On 6/28/22 04:17, guoren@kernel.org wrote:
>>> From: Guo Ren <guoren@linux.alibaba.com>
>>>
>>> Some architecture has a flexible requirement on the type of spinlock.
>>> Some LL/SC architectures of ISA don't force micro-arch to give a strong
>>> forward guarantee. Thus different kinds of memory model micro-arch would
>>> come out in one ISA. The ticket lock is suitable for exclusive monitor
>>> designed LL/SC micro-arch with limited cores and "!NUMA". The
>>> queue-spinlock could deal with NUMA/large-scale scenarios with a strong
>>> forward guarantee designed LL/SC micro-arch.
>>>
>>> So, make the spinlock a combo with feature.
>>>
>>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>>> Signed-off-by: Guo Ren <guoren@kernel.org>
>>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>>> Cc: Arnd Bergmann <arnd@arndb.de>
>>> Cc: Palmer Dabbelt <palmer@rivosinc.com>
>>> ---
>>>    include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
>>>    kernel/locking/qspinlock.c     |  2 ++
>>>    2 files changed, 43 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
>>> index f41dc7c2b900..a9b43089bf99 100644
>>> --- a/include/asm-generic/spinlock.h
>>> +++ b/include/asm-generic/spinlock.h
>>> @@ -28,34 +28,73 @@
>>>    #define __ASM_GENERIC_SPINLOCK_H
>>>
>>>    #include <asm-generic/ticket_spinlock.h>
>>> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
>>> +#include <linux/jump_label.h>
>>> +#include <asm-generic/qspinlock.h>
>>> +
>>> +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
>>> +#endif
>>> +
>>> +#undef arch_spin_is_locked
>>> +#undef arch_spin_is_contended
>>> +#undef arch_spin_value_unlocked
>>> +#undef arch_spin_lock
>>> +#undef arch_spin_trylock
>>> +#undef arch_spin_unlock
>>>
>>>    static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
>>>    {
>>> -     ticket_spin_lock(lock);
>>> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
>>> +     if (static_branch_likely(&use_qspinlock_key))
>>> +             queued_spin_lock(lock);
>>> +     else
>>> +#endif
>>> +             ticket_spin_lock(lock);
>>>    }
>> Why do you use a static key to control whether to use qspinlock or
>> ticket lock? In the next patch, you have
>>
>> +#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
>> +       static_branch_disable(&use_qspinlock_key);
>> +#endif
>>
>> So the current config setting determines if qspinlock will be used, not
>> some boot time parameter that user needs to specify. This patch will
>> just add useless code to lock/unlock sites. I don't see any benefit of
>> doing that.
> This is a startup patch for riscv. next, we could let vendors make choices.
> I'm not sure they like cmdline or vendor-specific errata style.
>
> Eventually, we would let one riscv Image support all machines, some
> use ticket-lock, and some use qspinlock.

OK. Maybe you can postpone this combo spinlock until there is a good use 
case for it. Upstream usually don't accept patches that have no good use 
case yet.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-28 18:05     ` Waiman Long
@ 2022-06-29  2:12       ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-06-29  2:12 UTC (permalink / raw)
  To: Waiman Long
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 2:06 AM Waiman Long <longman@redhat.com> wrote:
>
> On 6/28/22 04:17, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> > because the value has been in lock. This patch could prevent
> > arch_spin_value_unlocked contend spin_lock data again.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Palmer Dabbelt <palmer@rivosinc.com>
> > ---
> >   include/asm-generic/spinlock.h | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> > index fdfebcb050f4..f1e4fa100f5a 100644
> > --- a/include/asm-generic/spinlock.h
> > +++ b/include/asm-generic/spinlock.h
> > @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> >
> >   static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> >   {
> > -     return !arch_spin_is_locked(&lock);
> > +     u32 val = lock.counter;
> > +
> > +     return ((val >> 16) == (val & 0xffff));
> >   }
> >
> >   #include <asm/qrwlock.h>
>
> lockref.c is the only current user of arch_spin_value_unlocked(). This
> change is probably OK with this particular use case. Do you have any
> performance data about the improvement due to this change?
I don't have performance data; I just checked the asm code, and the
previous version had an additional, unnecessary atomic_read.

We've talked about this point before, but Palmer and I missed it when
we picked up Peter's patch again.
https://lore.kernel.org/linux-riscv/YHbmXXvuG442ZDfN@hirez.programming.kicks-ass.net/
----
> > +static __always_inline int ticket_value_unlocked(arch_spinlock_t lock)
> > +{
> > +       return !ticket_is_locked(&lock);
> Are you sure to let ticket_is_locked->atomic_read(lock) again, the
> lock has contained all information?
>
> return lock.tickets.owner == lock.tickets.next;

Yeah, I wrote then the wrong way around. Couldn't be bothered to go back
when I figured it out.
---
It's just a small typo.


>
> You may have to document that we have to revisit that if another use
> case shows up.
>
> Cheers,
> Longman
>


--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
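
To make the comparison concrete, a small sketch (not from the series)
of the two forms side by side; the point is that the lock is already a
by-value snapshot, so the atomic_read() hidden inside
arch_spin_is_locked() only re-reads the stack copy through a volatile
access the compiler must emit again:

/* Old form: takes the address of the by-value copy and goes through
 * atomic_read(), i.e. a volatile load that cannot be folded away.
 */
static __always_inline int value_unlocked_old(arch_spinlock_t lock)
{
	return !arch_spin_is_locked(&lock);
}

/* New form: the snapshot is already in 'lock', so compare the two
 * ticket halves directly and let the compiler keep it in registers.
 */
static __always_inline int value_unlocked_new(arch_spinlock_t lock)
{
	u32 val = lock.counter;

	return (val >> 16) == (val & 0xffff);
}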

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  1:34         ` Waiman Long
@ 2022-06-29  2:29           ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-06-29  2:29 UTC (permalink / raw)
  To: Waiman Long
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 9:34 AM Waiman Long <longman@redhat.com> wrote:
>
> On 6/28/22 21:17, Guo Ren wrote:
> > On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
> >> On 6/28/22 04:17, guoren@kernel.org wrote:
> >>> From: Guo Ren <guoren@linux.alibaba.com>
> >>>
> >>> Some architecture has a flexible requirement on the type of spinlock.
> >>> Some LL/SC architectures of ISA don't force micro-arch to give a strong
> >>> forward guarantee. Thus different kinds of memory model micro-arch would
> >>> come out in one ISA. The ticket lock is suitable for exclusive monitor
> >>> designed LL/SC micro-arch with limited cores and "!NUMA". The
> >>> queue-spinlock could deal with NUMA/large-scale scenarios with a strong
> >>> forward guarantee designed LL/SC micro-arch.
> >>>
> >>> So, make the spinlock a combo with feature.
> >>>
> >>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >>> Signed-off-by: Guo Ren <guoren@kernel.org>
> >>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> >>> Cc: Arnd Bergmann <arnd@arndb.de>
> >>> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> >>> ---
> >>>    include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
> >>>    kernel/locking/qspinlock.c     |  2 ++
> >>>    2 files changed, 43 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> >>> index f41dc7c2b900..a9b43089bf99 100644
> >>> --- a/include/asm-generic/spinlock.h
> >>> +++ b/include/asm-generic/spinlock.h
> >>> @@ -28,34 +28,73 @@
> >>>    #define __ASM_GENERIC_SPINLOCK_H
> >>>
> >>>    #include <asm-generic/ticket_spinlock.h>
> >>> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> >>> +#include <linux/jump_label.h>
> >>> +#include <asm-generic/qspinlock.h>
> >>> +
> >>> +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
> >>> +#endif
> >>> +
> >>> +#undef arch_spin_is_locked
> >>> +#undef arch_spin_is_contended
> >>> +#undef arch_spin_value_unlocked
> >>> +#undef arch_spin_lock
> >>> +#undef arch_spin_trylock
> >>> +#undef arch_spin_unlock
> >>>
> >>>    static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> >>>    {
> >>> -     ticket_spin_lock(lock);
> >>> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> >>> +     if (static_branch_likely(&use_qspinlock_key))
> >>> +             queued_spin_lock(lock);
> >>> +     else
> >>> +#endif
> >>> +             ticket_spin_lock(lock);
> >>>    }
> >> Why do you use a static key to control whether to use qspinlock or
> >> ticket lock? In the next patch, you have
> >>
> >> +#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
> >> +       static_branch_disable(&use_qspinlock_key);
> >> +#endif
> >>
> >> So the current config setting determines if qspinlock will be used, not
> >> some boot time parameter that user needs to specify. This patch will
> >> just add useless code to lock/unlock sites. I don't see any benefit of
> >> doing that.
> > This is a startup patch for riscv. next, we could let vendors make choices.
> > I'm not sure they like cmdline or vendor-specific errata style.
> >
> > Eventually, we would let one riscv Image support all machines, some
> > use ticket-lock, and some use qspinlock.
>
> OK. Maybe you can postpone this combo spinlock until there is a good use
> case for it. Upstream usually don't accept patches that have no good use
> case yet.
>
> Cheers,
> Longman
>

I would add a cmdline to control the choice of qspinlock/ticket-lock
in the next version.

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index b9b234157a66..5ade490c2f27 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -270,6 +270,10 @@ void __init setup_arch(char **cmdline_p)

        early_ioremap_setup();
        jump_label_init();
+
+#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
+       static_branch_disable(&use_qspinlock_key);
+#endif
        parse_early_param();

        efi_init();
@@ -295,10 +299,6 @@ void __init setup_arch(char **cmdline_p)
        setup_smp();
 #endif

-#if !defined(CONFIG_NUMA) && defined(CONFIG_QUEUED_SPINLOCKS)
-       static_branch_disable(&use_qspinlock_key);
-#endif
-
        riscv_fill_hwcap();
        apply_boot_alternatives();
 }
@@ -330,3 +330,13 @@ void free_initmem(void)

        free_initmem_default(POISON_FREE_INITMEM);
 }
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+static int __init disable_qspinlock(char *p)
+{
+       static_branch_disable(&use_qspinlock_key);
+       return 0;
+}
+
+early_param("disable_qspinlock", disable_qspinlock);
+#endif


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
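
As a usage note, with a change along those lines a machine that wants
the ticket path would just add the new parameter to its kernel command
line, e.g. (hypothetical command line; only the last word comes from
the early_param() in the diff above):

    console=ttyS0 root=/dev/vda1 ro disable_qspinlock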

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  1:34         ` Waiman Long
@ 2022-06-29  7:08           ` Arnd Bergmann
  -1 siblings, 0 replies; 48+ messages in thread
From: Arnd Bergmann @ 2022-06-29  7:08 UTC (permalink / raw)
  To: Waiman Long
  Cc: Guo Ren, Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
> On 6/28/22 21:17, Guo Ren wrote:
> > On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
> >> On 6/28/22 04:17, guoren@kernel.org wrote:
> >>
> >> So the current config setting determines if qspinlock will be used, not
> >> some boot time parameter that user needs to specify. This patch will
> >> just add useless code to lock/unlock sites. I don't see any benefit of
> >> doing that.
> > This is a startup patch for riscv. next, we could let vendors make choices.
> > I'm not sure they like cmdline or vendor-specific errata style.
> >
> > Eventually, we would let one riscv Image support all machines, some
> > use ticket-lock, and some use qspinlock.
>
> OK. Maybe you can postpone this combo spinlock until there is a good use
> case for it. Upstream usually don't accept patches that have no good use
> case yet.

I think the usecase on risc-v is this: there are cases where the qspinlock
is preferred for performance reasons, but there are also CPU cores on
which it is not safe to use. risc-v like most modern architectures has a
strict rule about being able to build kernels that work on all machines,
so without something like this, it would not be able to use qspinlock at all.

On the other hand, I don't really like the idea of putting the static-key
wrapper into the asm-generic header. Especially the ticket spinlock
implementation should be simple and not depend on jump labels.

From looking at the header file dependencies on arm64, I know that
putting jump labels into core infrastructure like the arch_spin_lock()
makes a big mess of indirect includes and measurably slows down
the kernel build.

I think this can still be done in the riscv asm/spinlock.h header with
minimal impact on the asm-generic file if the riscv maintainers see
a significant enough advantage, but I don't want it in the common code.

        Arnd
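
For concreteness, a rough sketch of what that suggestion could look
like on the riscv side (hypothetical, not code from this series; the
function and key names follow the quoted patch):

/*
 * Hypothetical arch/riscv/include/asm/spinlock.h: keep the static-key
 * dispatch riscv-local so the asm-generic ticket lock stays free of
 * jump labels.
 */
#include <linux/jump_label.h>
#include <asm-generic/ticket_spinlock.h>
#include <asm-generic/qspinlock.h>

DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);

/* The generic helpers would need the same #undef treatment as in the
 * quoted patch before being redefined here.
 */
static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
	if (static_branch_likely(&use_qspinlock_key))
		queued_spin_lock(lock);
	else
		ticket_spin_lock(lock);
}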

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  7:08           ` Arnd Bergmann
@ 2022-06-29  8:24             ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-06-29  8:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Waiman Long, Palmer Dabbelt, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 3:09 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Jun 29, 2022 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
> > On 6/28/22 21:17, Guo Ren wrote:
> > > On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
> > >> On 6/28/22 04:17, guoren@kernel.org wrote:
> > >>
> > >> So the current config setting determines if qspinlock will be used, not
> > >> some boot time parameter that user needs to specify. This patch will
> > >> just add useless code to lock/unlock sites. I don't see any benefit of
> > >> doing that.
> > > This is a startup patch for riscv. next, we could let vendors make choices.
> > > I'm not sure they like cmdline or vendor-specific errata style.
> > >
> > > Eventually, we would let one riscv Image support all machines, some
> > > use ticket-lock, and some use qspinlock.
> >
> > OK. Maybe you can postpone this combo spinlock until there is a good use
> > case for it. Upstream usually don't accept patches that have no good use
> > case yet.
>
> I think the usecase on risc-v is this: there are cases where the qspinlock
> is preferred for performance reasons, but there are also CPU cores on
> which it is not safe to use. risc-v like most modern architectures has a
> strict rule about being able to build kernels that work on all machines,
> so without something like this, it would not be able to use qspinlock at all.
>
> On the other hand, I don't really like the idea of putting the static-key
> wrapper into the asm-generic header. Especially the ticket spinlock
> implementation should be simple and not depend on jump labels.
If CONFIG_ARCH_USE_QUEUED_SPINLOCKS is not enabled, the patch will
still keep the ticket-lock simple, without jump labels.

>
> From looking at the header file dependencies on arm64, I know that
> putting jump labels into core infrastructure like the arch_spin_lock()
> makes a big mess of indirect includes and measurably slows down
> the kernel build.
arm64 needn't combo spinlock, it could use pure qspinlock with keeping
current header files included.

>
> I think this can still be done in the riscv asm/spinlock.h header with
> minimal impact on the asm-generic file if the riscv maintainers see
> a significant enough advantage, but I don't want it in the common code.
Yes, it could. I agree with using combo spinlock only with riscv.

>
>         Arnd



-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/
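
To illustrate the point above about the config-off case: with
CONFIG_ARCH_USE_QUEUED_SPINLOCKS=n the #ifdef block in the patch's
arch_spin_lock() compiles away entirely, leaving the plain ticket path
with no static key and no jump_label.h include, roughly:

/* Sketch of the preprocessed result, not a separate implementation. */
static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
	ticket_spin_lock(lock);
}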

^ permalink raw reply	[flat|nested] 48+ messages in thread

* RE: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-28  8:17   ` guoren
@ 2022-06-29  8:27     ` David Laight
  -1 siblings, 0 replies; 48+ messages in thread
From: David Laight @ 2022-06-29  8:27 UTC (permalink / raw)
  To: 'guoren@kernel.org',
	palmer, arnd, mingo, will, longman, boqun.feng
  Cc: linux-riscv, linux-arch, linux-kernel, Guo Ren, Peter Zijlstra

From: guoren@kernel.org
> Sent: 28 June 2022 09:17
> 
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> because the value has been in lock. This patch could prevent
> arch_spin_value_unlocked contend spin_lock data again.

I'm confused (as usual).
Isn't atomic_read() pretty much free?

..
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..f1e4fa100f5a 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> 
>  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>  {
> -	return !arch_spin_is_locked(&lock);
> +	u32 val = lock.counter;
> +
> +	return ((val >> 16) == (val & 0xffff));

That almost certainly needs a READ_ONCE().

The result is also inherently stale.
So the uses must be pretty limited.

	David

>  }
> 
>  #include <asm/qrwlock.h>
> --
> 2.36.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
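
For context, a rough sketch of where a READ_ONCE() would sit in a
caller of this helper (hypothetical caller, not lib/lockref.c; assumes
the atomic_t-based arch_spinlock_t of this series). The helper only
ever sees a by-value snapshot, which is also why its result is
inherently stale:

/* Sketch only: the racy load from the live lock word happens once in
 * the caller; everything after that operates on a local copy.
 */
static bool sketch_fast_path(arch_spinlock_t *lock)
{
	arch_spinlock_t snap;

	snap.counter = READ_ONCE(lock->counter);	/* snapshot the live word */
	if (!arch_spin_value_unlocked(snap))
		return false;
	/* ... speculative work here, revalidated later (e.g. by cmpxchg) ... */
	return true;
}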


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  8:24             ` Guo Ren
@ 2022-06-29  8:29               ` Arnd Bergmann
  -1 siblings, 0 replies; 48+ messages in thread
From: Arnd Bergmann @ 2022-06-29  8:29 UTC (permalink / raw)
  To: Guo Ren
  Cc: Arnd Bergmann, Waiman Long, Palmer Dabbelt, Ingo Molnar,
	Will Deacon, Boqun Feng, linux-riscv, linux-arch,
	Linux Kernel Mailing List, Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 10:24 AM Guo Ren <guoren@kernel.org> wrote:
> On Wed, Jun 29, 2022 at 3:09 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > On Wed, Jun 29, 2022 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
> >
> > From looking at the header file dependencies on arm64, I know that
> > putting jump labels into core infrastructure like the arch_spin_lock()
> > makes a big mess of indirect includes and measurably slows down
> > the kernel build.
> arm64 needn't combo spinlock, it could use pure qspinlock with keeping
> current header files included.

arm64 has a different problem: there are two separate sets of atomic
instructions, and the decision between those is similarly done using
jump labels. I definitely like the ability to choose between qspinlock
and ticket spinlock on arm64 as well. This can be done as a
compile-time choice, but both of them still depend on jump labels.

        Arnd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  7:08           ` Arnd Bergmann
@ 2022-06-29 12:53             ` Waiman Long
  -1 siblings, 0 replies; 48+ messages in thread
From: Waiman Long @ 2022-06-29 12:53 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Guo Ren, Palmer Dabbelt, Ingo Molnar, Will Deacon, Boqun Feng,
	linux-riscv, linux-arch, Linux Kernel Mailing List, Guo Ren,
	Peter Zijlstra

On 6/29/22 03:08, Arnd Bergmann wrote:
> On Wed, Jun 29, 2022 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
>> On 6/28/22 21:17, Guo Ren wrote:
>>> On Wed, Jun 29, 2022 at 2:13 AM Waiman Long <longman@redhat.com> wrote:
>>>> On 6/28/22 04:17, guoren@kernel.org wrote:
>>>>
>>>> So the current config setting determines if qspinlock will be used, not
>>>> some boot time parameter that user needs to specify. This patch will
>>>> just add useless code to lock/unlock sites. I don't see any benefit of
>>>> doing that.
>>> This is a startup patch for riscv. next, we could let vendors make choices.
>>> I'm not sure they like cmdline or vendor-specific errata style.
>>>
>>> Eventually, we would let one riscv Image support all machines, some
>>> use ticket-lock, and some use qspinlock.
>> OK. Maybe you can postpone this combo spinlock until there is a good use
>> case for it. Upstream usually don't accept patches that have no good use
>> case yet.
> I think the usecase on risc-v is this: there are cases where the qspinlock
> is preferred for performance reasons, but there are also CPU cores on
> which it is not safe to use. risc-v like most modern architectures has a
> strict rule about being able to build kernels that work on all machines,
> so without something like this, it would not be able to use qspinlock at all.

My objection to the current patch is really that everything is
determined at compile time. There is no point in using a static key if
it cannot be changed at boot time. Adding a boot-time switch does make
the use of a static key more reasonable.


>
> On the other hand, I don't really like the idea of putting the static-key
> wrapper into the asm-generic header. Especially the ticket spinlock
> implementation should be simple and not depend on jump labels.
>
>  From looking at the header file dependencies on arm64, I know that
> putting jump labels into core infrastructure like the arch_spin_lock()
> makes a big mess of indirect includes and measurably slows down
> the kernel build.
>
> I think this can still be done in the riscv asm/spinlock.h header with
> minimal impact on the asm-generic file if the riscv maintainers see
> a significant enough advantage, but I don't want it in the common code.

I have a similar feeling. In addition, I don't like the idea of adding a
static key to qspinlock.c that has nothing to do with the qspinlock
logic. I would like to see it put elsewhere.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-29  8:29               ` Arnd Bergmann
@ 2022-07-01 12:18                 ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-07-01 12:18 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Waiman Long, Palmer Dabbelt, Ingo Molnar, Will Deacon,
	Boqun Feng, linux-riscv, linux-arch, Linux Kernel Mailing List,
	Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 4:30 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Wed, Jun 29, 2022 at 10:24 AM Guo Ren <guoren@kernel.org> wrote:
> > On Wed, Jun 29, 2022 at 3:09 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Wed, Jun 29, 2022 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
> > >
> > > From looking at the header file dependencies on arm64, I know that
> > > putting jump labels into core infrastructure like the arch_spin_lock()
> > > makes a big mess of indirect includes and measurably slows down
> > > the kernel build.
> > arm64 needn't combo spinlock, it could use pure qspinlock with keeping
> > current header files included.
>
> arm64 has a different problem: there are two separate sets of atomic
> instructions, and the decision between those is similarly done using
> jump labels. I definitely like the ability to choose between qspinlock
> and ticket spinlock on arm64 as well. This can be done as a
> compile-time choice, but both of them still depend on jump labels.
1. xchg uses ALTERNATIVE, but cmpxchg uses jump labels.
2. arm64 still uses qspinlock with LL/SC, and I think it gives a strong
enough forward guarantee with "prfm pstl1strm".
But another question: if LL/SC could give a strong enough forward
guarantee, why did arm64 introduce LSE? Just for code size reduction?
Why is instruction fusion technology not enough?
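
For reference, a much-simplified sketch of the two arm64 cmpxchg flavours
being discussed here. This is not the kernel's implementation (that lives
in atomic_ll_sc.h / atomic_lse.h and is selected at boot); it only shows
the instruction patterns:

/* Illustrative only: simplified 32-bit compare-and-swap, both styles. */
static inline u32 cmpxchg32_llsc(u32 *p, u32 old, u32 new)
{
	u32 ret, fail;

	asm volatile(
	"1:	ldaxr	%w0, %2\n"		/* load-exclusive */
	"	cmp	%w0, %w3\n"
	"	b.ne	2f\n"
	"	stlxr	%w1, %w4, %2\n"		/* store-exclusive, may fail */
	"	cbnz	%w1, 1b\n"		/* retry until SC succeeds */
	"2:"
	: "=&r" (ret), "=&r" (fail), "+Q" (*p)
	: "r" (old), "r" (new)
	: "cc", "memory");
	return ret;
}

static inline u32 cmpxchg32_lse(u32 *p, u32 old, u32 new)
{
	/* One instruction; the hardware provides atomicity and progress. */
	asm volatile("casal	%w0, %w2, %1"
		     : "+r" (old), "+Q" (*p)
		     : "r" (new)
		     : "memory");
	return old;
}

The LL/SC loop can keep retrying under contention, which is where the
forward-progress question comes from; the LSE form completes in a single
instruction, which is also part of the code-size argument.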


>
>         Arnd



--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-29  8:27     ` David Laight
@ 2022-07-01 15:18       ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-07-01 15:18 UTC (permalink / raw)
  To: David Laight
  Cc: palmer, arnd, mingo, will, longman, boqun.feng, linux-riscv,
	linux-arch, linux-kernel, Guo Ren, Peter Zijlstra

On Wed, Jun 29, 2022 at 4:27 PM David Laight <David.Laight@aculab.com> wrote:
>
> From: guoren@kernel.org
> > Sent: 28 June 2022 09:17
> >
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> > because the value has been in lock. This patch could prevent
> > arch_spin_value_unlocked contend spin_lock data again.
>
> I'm confused (as usual).
> Isn't atomic_read() pretty much free?
When a cache line is shared among multiple harts, it is not as free as
you think. Avoiding touching contended data is the basic principle.

atomic_read on alpha is heavy, and alpha could be a potential user of
the ticket lock.

>
> ..
> > diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> > index fdfebcb050f4..f1e4fa100f5a 100644
> > --- a/include/asm-generic/spinlock.h
> > +++ b/include/asm-generic/spinlock.h
> > @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> >
> >  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> >  {
> > -     return !arch_spin_is_locked(&lock);
> > +     u32 val = lock.counter;
> > +
> > +     return ((val >> 16) == (val & 0xffff));
>
> That almost certainly needs a READ_ONCE().
>
> The result is also inherently stale.
> So the uses must be pretty limited.
The caller's previous READ_ONCE could have fetched 64 bits; this API is
then used to check only the 32-bit atomic (lock) part of that value.
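
For context, a sketch of the calling pattern being described, loosely
modelled on lib/lockref.c: the 64-bit word containing the lock has
already been fetched with READ_ONCE(), so the unlocked check should only
look at the register copy and must not load the contended line again.
The structure and helper below are illustrative, not the real lockref
code:

struct foo_ref {
	union {
		u64 lock_count;			/* read as one 64-bit value */
		struct {
			arch_spinlock_t lock;	/* ticket: next in high 16 bits,
						 * owner in low 16 bits */
			u32 count;
		};
	};
};

static bool foo_ref_is_free(struct foo_ref *ref)
{
	struct foo_ref old;

	old.lock_count = READ_ONCE(ref->lock_count);	/* one load */

	/*
	 * Works on the copy in 'old'; with this patch it reduces to
	 * (val >> 16) == (val & 0xffff) on a register value.
	 */
	return arch_spin_value_unlocked(old.lock);
}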

>
>         David
>
> >  }
> >
> >  #include <asm/qrwlock.h>
> > --
> > 2.36.1
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-06-28  8:17   ` guoren
@ 2022-07-04  9:52     ` Peter Zijlstra
  -1 siblings, 0 replies; 48+ messages in thread
From: Peter Zijlstra @ 2022-07-04  9:52 UTC (permalink / raw)
  To: guoren
  Cc: palmer, arnd, mingo, will, longman, boqun.feng, linux-riscv,
	linux-arch, linux-kernel, Guo Ren

On Tue, Jun 28, 2022 at 04:17:03AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> because the value has been in lock. This patch could prevent
> arch_spin_value_unlocked contend spin_lock data again.
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> ---
>  include/asm-generic/spinlock.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..f1e4fa100f5a 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  
>  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>  {
> -	return !arch_spin_is_locked(&lock);
> +	u32 val = lock.counter;
> +
> +	return ((val >> 16) == (val & 0xffff));
>  }

Wouldn't the right thing be to flip arch_spin_is_locked() and
arch_spin_value_unlocked()?


diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index fdfebcb050f4..63ab4da262f2 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -68,23 +68,25 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 	smp_store_release(ptr, (u16)val + 1);
 }
 
-static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
+static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
 	u32 val = atomic_read(lock);
 
-	return ((val >> 16) != (val & 0xffff));
+	return (s16)((val >> 16) - (val & 0xffff)) > 1;
 }
 
-static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
-	u32 val = atomic_read(lock);
+	u32 val = lock.counter;
 
-	return (s16)((val >> 16) - (val & 0xffff)) > 1;
+	return ((val >> 16) == (val & 0xffff));
 }
 
-static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	return !arch_spin_is_locked(&lock);
+	arch_spinlock_t val = READ_ONCE(*lock);
+
+	return !arch_spin_value_unlocked(val);
 }
 
 #include <asm/qrwlock.h>


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-06-28  8:17   ` guoren
@ 2022-07-04  9:57     ` Peter Zijlstra
  -1 siblings, 0 replies; 48+ messages in thread
From: Peter Zijlstra @ 2022-07-04  9:57 UTC (permalink / raw)
  To: guoren
  Cc: palmer, arnd, mingo, will, longman, boqun.feng, linux-riscv,
	linux-arch, linux-kernel, Guo Ren

On Tue, Jun 28, 2022 at 04:17:06AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> Some architecture has a flexible requirement on the type of spinlock.
> Some LL/SC architectures of ISA don't force micro-arch to give a strong
> forward guarantee. Thus different kinds of memory model micro-arch would
> come out in one ISA. The ticket lock is suitable for exclusive monitor
> designed LL/SC micro-arch with limited cores and "!NUMA". The
> queue-spinlock could deal with NUMA/large-scale scenarios with a strong
> forward guarantee designed LL/SC micro-arch.
> 
> So, make the spinlock a combo with feature.
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Palmer Dabbelt <palmer@rivosinc.com>
> ---
>  include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
>  kernel/locking/qspinlock.c     |  2 ++
>  2 files changed, 43 insertions(+), 2 deletions(-)
> 
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index f41dc7c2b900..a9b43089bf99 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -28,34 +28,73 @@
>  #define __ASM_GENERIC_SPINLOCK_H
>  
>  #include <asm-generic/ticket_spinlock.h>
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +#include <linux/jump_label.h>
> +#include <asm-generic/qspinlock.h>
> +
> +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
> +#endif
> +
> +#undef arch_spin_is_locked
> +#undef arch_spin_is_contended
> +#undef arch_spin_value_unlocked
> +#undef arch_spin_lock
> +#undef arch_spin_trylock
> +#undef arch_spin_unlock
>  
>  static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
>  {
> -	ticket_spin_lock(lock);
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		queued_spin_lock(lock);
> +	else
> +#endif
> +		ticket_spin_lock(lock);
>  }
>  
>  static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
>  {
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		return queued_spin_trylock(lock);
> +#endif
>  	return ticket_spin_trylock(lock);
>  }
>  
>  static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>  {
> -	ticket_spin_unlock(lock);
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		queued_spin_unlock(lock);
> +	else
> +#endif
> +		ticket_spin_unlock(lock);
>  }
>  
>  static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		return queued_spin_is_locked(lock);
> +#endif
>  	return ticket_spin_is_locked(lock);
>  }
>  
>  static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  {
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		return queued_spin_is_contended(lock);
> +#endif
>  	return ticket_spin_is_contended(lock);
>  }
>  
>  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>  {
> +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> +	if (static_branch_likely(&use_qspinlock_key))
> +		return queued_spin_value_unlocked(lock);
> +#endif
>  	return ticket_spin_value_unlocked(lock);
>  }

Urggghhhh....

I really don't think you want this in generic code. Also, I'm thinking
any arch that does this wants to make sure it doesn't inline any of this
stuff. That is, said arch must not have ARCH_INLINE_SPIN_*

And if you're going to force things out of line, then I think you can
get better code using static_call().

*shudder*...
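
For illustration, a rough sketch of that shape: the lock ops become
out-of-line wrappers selected through static_call(). The names below are
invented and this is not what the posted series does:

#include <linux/static_call.h>

static void ticket_lock_outofline(arch_spinlock_t *lock)
{
	ticket_spin_lock(lock);
}

static void queued_lock_outofline(arch_spinlock_t *lock)
{
	queued_spin_lock(lock);
}

DEFINE_STATIC_CALL(arch_lock_impl, ticket_lock_outofline);

void arch_spin_lock(arch_spinlock_t *lock)
{
	static_call(arch_lock_impl)(lock);
}

/* Boot code on a machine with strong LR/SC forward progress could do: */
static void __init select_qspinlock(void)
{
	static_call_update(arch_lock_impl, queued_lock_outofline);
}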

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read
  2022-07-04  9:52     ` Peter Zijlstra
@ 2022-07-04 11:10       ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-07-04 11:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Waiman Long, Boqun Feng, linux-riscv, linux-arch,
	Linux Kernel Mailing List, Guo Ren

On Mon, Jul 4, 2022 at 5:52 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jun 28, 2022 at 04:17:03AM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Remove unnecessary atomic_read in arch_spin_value_unlocked(lock),
> > because the value has been in lock. This patch could prevent
> > arch_spin_value_unlocked contend spin_lock data again.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Palmer Dabbelt <palmer@rivosinc.com>
> > ---
> >  include/asm-generic/spinlock.h | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> > index fdfebcb050f4..f1e4fa100f5a 100644
> > --- a/include/asm-generic/spinlock.h
> > +++ b/include/asm-generic/spinlock.h
> > @@ -84,7 +84,9 @@ static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> >
> >  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> >  {
> > -     return !arch_spin_is_locked(&lock);
> > +     u32 val = lock.counter;
> > +
> > +     return ((val >> 16) == (val & 0xffff));
> >  }
>
> Wouldn't the right thing be to flip arch_spin_is_locked() and
> arch_spin_value_is_unlocked() ?
Okay, I agree with your patch. In the next version, I will take the code below.

>
>
> diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> index fdfebcb050f4..63ab4da262f2 100644
> --- a/include/asm-generic/spinlock.h
> +++ b/include/asm-generic/spinlock.h
> @@ -68,23 +68,25 @@ static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
>         smp_store_release(ptr, (u16)val + 1);
>  }
>
> -static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
> +static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
>  {
>         u32 val = atomic_read(lock);
>
> -       return ((val >> 16) != (val & 0xffff));
> +       return (s16)((val >> 16) - (val & 0xffff)) > 1;
>  }
>
> -static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
>  {
> -       u32 val = atomic_read(lock);
> +       u32 val = lock.counter;
>
> -       return (s16)((val >> 16) - (val & 0xffff)) > 1;
> +       return ((val >> 16) == (val & 0xffff));
>  }
>
> -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> +static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
>  {
> -       return !arch_spin_is_locked(&lock);
> +       arch_spinlock_t val = READ_ONCE(*lock);
> +
> +       return !arch_spin_value_unlocked(val);
>  }
>
>  #include <asm/qrwlock.h>
>


-- 
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-07-04  9:57     ` Peter Zijlstra
@ 2022-07-04 13:13       ` Guo Ren
  -1 siblings, 0 replies; 48+ messages in thread
From: Guo Ren @ 2022-07-04 13:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Waiman Long, Boqun Feng, linux-riscv, linux-arch,
	Linux Kernel Mailing List, Guo Ren

On Mon, Jul 4, 2022 at 5:58 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Jun 28, 2022 at 04:17:06AM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > Some architecture has a flexible requirement on the type of spinlock.
> > Some LL/SC architectures of ISA don't force micro-arch to give a strong
> > forward guarantee. Thus different kinds of memory model micro-arch would
> > come out in one ISA. The ticket lock is suitable for exclusive monitor
> > designed LL/SC micro-arch with limited cores and "!NUMA". The
> > queue-spinlock could deal with NUMA/large-scale scenarios with a strong
> > forward guarantee designed LL/SC micro-arch.
> >
> > So, make the spinlock a combo with feature.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Palmer Dabbelt <palmer@rivosinc.com>
> > ---
> >  include/asm-generic/spinlock.h | 43 ++++++++++++++++++++++++++++++++--
> >  kernel/locking/qspinlock.c     |  2 ++
> >  2 files changed, 43 insertions(+), 2 deletions(-)
> >
> > diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
> > index f41dc7c2b900..a9b43089bf99 100644
> > --- a/include/asm-generic/spinlock.h
> > +++ b/include/asm-generic/spinlock.h
> > @@ -28,34 +28,73 @@
> >  #define __ASM_GENERIC_SPINLOCK_H
> >
> >  #include <asm-generic/ticket_spinlock.h>
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +#include <linux/jump_label.h>
> > +#include <asm-generic/qspinlock.h>
> > +
> > +DECLARE_STATIC_KEY_TRUE(use_qspinlock_key);
> > +#endif
> > +
> > +#undef arch_spin_is_locked
> > +#undef arch_spin_is_contended
> > +#undef arch_spin_value_unlocked
> > +#undef arch_spin_lock
> > +#undef arch_spin_trylock
> > +#undef arch_spin_unlock
> >
> >  static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> >  {
> > -     ticket_spin_lock(lock);
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             queued_spin_lock(lock);
> > +     else
> > +#endif
> > +             ticket_spin_lock(lock);
> >  }
> >
> >  static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
> >  {
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             return queued_spin_trylock(lock);
> > +#endif
> >       return ticket_spin_trylock(lock);
> >  }
> >
> >  static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
> >  {
> > -     ticket_spin_unlock(lock);
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             queued_spin_unlock(lock);
> > +     else
> > +#endif
> > +             ticket_spin_unlock(lock);
> >  }
> >
> >  static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
> >  {
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             return queued_spin_is_locked(lock);
> > +#endif
> >       return ticket_spin_is_locked(lock);
> >  }
> >
> >  static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> >  {
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             return queued_spin_is_contended(lock);
> > +#endif
> >       return ticket_spin_is_contended(lock);
> >  }
> >
> >  static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> >  {
> > +#ifdef CONFIG_ARCH_USE_QUEUED_SPINLOCKS
> > +     if (static_branch_likely(&use_qspinlock_key))
> > +             return queued_spin_value_unlocked(lock);
> > +#endif
> >       return ticket_spin_value_unlocked(lock);
> >  }
>
> Urggghhhh....
>
> I really don't think you want this in generic code. Also, I'm thinking
> any arch that does this wants to make sure it doesn't inline any of this
Your advice is the same as Arnd's; I would move the static_branch out of the generic code.

> stuff. That is, said arch must not have ARCH_INLINE_SPIN_*
What do you mean? I've tested with ARCH_INLINE_SPIN_* and it's okay
with EXPORT_SYMBOL(use_qspinlock_key).

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 47e12ab9c822..4587fb544326 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -32,6 +32,32 @@ config RISCV
        select ARCH_HAS_STRICT_MODULE_RWX if MMU && !XIP_KERNEL
        select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
        select ARCH_HAS_UBSAN_SANITIZE_ALL
+       select ARCH_INLINE_READ_LOCK if !PREEMPTION
+       select ARCH_INLINE_READ_LOCK_BH if !PREEMPTION
+       select ARCH_INLINE_READ_LOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_READ_LOCK_IRQSAVE if !PREEMPTION
+       select ARCH_INLINE_READ_UNLOCK if !PREEMPTION
+       select ARCH_INLINE_READ_UNLOCK_BH if !PREEMPTION
+       select ARCH_INLINE_READ_UNLOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_READ_UNLOCK_IRQRESTORE if !PREEMPTION
+       select ARCH_INLINE_WRITE_LOCK if !PREEMPTION
+       select ARCH_INLINE_WRITE_LOCK_BH if !PREEMPTION
+       select ARCH_INLINE_WRITE_LOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_WRITE_LOCK_IRQSAVE if !PREEMPTION
+       select ARCH_INLINE_WRITE_UNLOCK if !PREEMPTION
+       select ARCH_INLINE_WRITE_UNLOCK_BH if !PREEMPTION
+       select ARCH_INLINE_WRITE_UNLOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE if !PREEMPTION
+       select ARCH_INLINE_SPIN_TRYLOCK if !PREEMPTION
+       select ARCH_INLINE_SPIN_TRYLOCK_BH if !PREEMPTION
+       select ARCH_INLINE_SPIN_LOCK if !PREEMPTION
+       select ARCH_INLINE_SPIN_LOCK_BH if !PREEMPTION
+       select ARCH_INLINE_SPIN_LOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_SPIN_LOCK_IRQSAVE if !PREEMPTION
+       select ARCH_INLINE_SPIN_UNLOCK if !PREEMPTION
+       select ARCH_INLINE_SPIN_UNLOCK_BH if !PREEMPTION
+       select ARCH_INLINE_SPIN_UNLOCK_IRQ if !PREEMPTION
+       select ARCH_INLINE_SPIN_UNLOCK_IRQRESTORE if !PREEMPTION
        select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
        select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
        select ARCH_STACKWALK

Shall I add the above diff to the next version of the patch series?

>
> And if you're going to force things out of line, then I think you can
> get better code using static_call().
Good point, thx.

>
> *shudder*...



--
Best Regards
 Guo Ren

ML: https://lore.kernel.org/linux-csky/

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued)
  2022-07-04 13:13       ` Guo Ren
@ 2022-07-04 13:45         ` Peter Zijlstra
  -1 siblings, 0 replies; 48+ messages in thread
From: Peter Zijlstra @ 2022-07-04 13:45 UTC (permalink / raw)
  To: Guo Ren
  Cc: Palmer Dabbelt, Arnd Bergmann, Ingo Molnar, Will Deacon,
	Waiman Long, Boqun Feng, linux-riscv, linux-arch,
	Linux Kernel Mailing List, Guo Ren

On Mon, Jul 04, 2022 at 09:13:40PM +0800, Guo Ren wrote:

> > Urggghhhh....
> >
> > I really don't think you want this in generic code. Also, I'm thinking
> > any arch that does this wants to make sure it doesn't inline any of this
> Your advice is the same with Arnd, I would move static_branch out of generic.
> 
> > stuff. That is, said arch must not have ARCH_INLINE_SPIN_*
> What do you mean? I've tested with ARCH_INLINE_SPIN_* and it's okay
> with EXPORT_SYMBOL(use_qspinlock_key).

Well, with the static_branch and the two paths I just don't see the code
being sane/small enough to inline. I mean, sure, you can force it to
inline the thing, but I'm not sure that's wise.

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2022-07-04 13:46 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-28  8:17 [PATCH V7 0/5] riscv: Add qspinlock support with combo style guoren
2022-06-28  8:17 ` guoren
2022-06-28  8:17 ` [PATCH V7 1/5] asm-generic: ticket-lock: Remove unnecessary atomic_read guoren
2022-06-28  8:17   ` guoren
2022-06-28 18:05   ` Waiman Long
2022-06-28 18:05     ` Waiman Long
2022-06-29  2:12     ` Guo Ren
2022-06-29  2:12       ` Guo Ren
2022-06-29  8:27   ` David Laight
2022-06-29  8:27     ` David Laight
2022-07-01 15:18     ` Guo Ren
2022-07-01 15:18       ` Guo Ren
2022-07-04  9:52   ` Peter Zijlstra
2022-07-04  9:52     ` Peter Zijlstra
2022-07-04 11:10     ` Guo Ren
2022-07-04 11:10       ` Guo Ren
2022-06-28  8:17 ` [PATCH V7 2/5] asm-generic: ticket-lock: Use the same struct definitions with qspinlock guoren
2022-06-28  8:17   ` guoren
2022-06-28  8:17 ` [PATCH V7 3/5] asm-generic: ticket-lock: Move into ticket_spinlock.h guoren
2022-06-28  8:17   ` guoren
2022-06-28  8:17 ` [PATCH V7 4/5] asm-generic: spinlock: Add combo spinlock (ticket & queued) guoren
2022-06-28  8:17   ` guoren
2022-06-28 18:13   ` Waiman Long
2022-06-28 18:13     ` Waiman Long
2022-06-29  1:17     ` Guo Ren
2022-06-29  1:17       ` Guo Ren
2022-06-29  1:34       ` Waiman Long
2022-06-29  1:34         ` Waiman Long
2022-06-29  2:29         ` Guo Ren
2022-06-29  2:29           ` Guo Ren
2022-06-29  7:08         ` Arnd Bergmann
2022-06-29  7:08           ` Arnd Bergmann
2022-06-29  8:24           ` Guo Ren
2022-06-29  8:24             ` Guo Ren
2022-06-29  8:29             ` Arnd Bergmann
2022-06-29  8:29               ` Arnd Bergmann
2022-07-01 12:18               ` Guo Ren
2022-07-01 12:18                 ` Guo Ren
2022-06-29 12:53           ` Waiman Long
2022-06-29 12:53             ` Waiman Long
2022-07-04  9:57   ` Peter Zijlstra
2022-07-04  9:57     ` Peter Zijlstra
2022-07-04 13:13     ` Guo Ren
2022-07-04 13:13       ` Guo Ren
2022-07-04 13:45       ` Peter Zijlstra
2022-07-04 13:45         ` Peter Zijlstra
2022-06-28  8:17 ` [PATCH V7 5/5] riscv: Add qspinlock support guoren
2022-06-28  8:17   ` guoren
