* [PATCH V10 00/19] riscv: Add Native/Paravirt/CNA qspinlock support
@ 2023-08-02 16:46 ` guoren
  0 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Hello everyone,

I'm back after one year. This is the tenth version of riscv qspinlock.

patch[1 - 8]: Native qspinlock
patch[9 -17]: Paravirt qspinlock
patch[18-19]: Compact NUMA-aware (CNA) qspinlock

This series is based on Andrew Jones' pv-time patches and Alex Kogan's
CNA qspinlock patches. I merged them into the sg2042-master branch, so
you can try it directly:

https://github.com/guoren83/linux/tree/qspinlock_v10_pvlock_cna_qspinlock_v15

Use sophgo_mango_ubuntu_defconfig for the 64-core sg2042 board.
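
For example (the branch and defconfig names are from the link above; the
cross-compiler prefix is just a common choice, not something this series
specifies):

	git clone -b qspinlock_v10_pvlock_cna_qspinlock_v15 \
		https://github.com/guoren83/linux.git
	cd linux
	make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \
		sophgo_mango_ubuntu_defconfig
	make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- -j$(nproc)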

Native qspinlock
================

This time we've proven qspinlock on the th1520 [1] & sg2042 [2], which
gives stability and a performance improvement. All T-HEAD processors
have a stronger LR/SC forward progress guarantee than the ISA requires,
which satisfies the xchg_tail of the native qspinlock. qspinlock has
now run with us for more than a year, and we have enough confidence to
enable it for all T-HEAD processors. Of course, we found a livelock
problem with the qspinlock lock torture test, caused by the CPU's store
merge buffer delay mechanism, which turned the queued spinlock into a
dead ring and made RCU warnings appear. We introduce a custom
WRITE_ONCE() to solve this. Do we need an explicit ISA instruction to
signal it, or should hardware handle it?
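
The essence of the fix (the full alternative-patched version is in
patch 3) is to drain the store merge buffer right after the store:

	/* Sketch of the ERRATA_THEAD_WRITE_ONCE idea from patch 3:
	 * follow the plain store with a "fence w, o" so the store
	 * leaves the merge buffer and other harts can observe it. */
	#define __WRITE_ONCE(x, val)				\
	do {							\
		*(volatile typeof(x) *)&(x) = (val);		\
		asm volatile("fence w, o" : : : "memory");	\
	} while (0)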

We've tested the patches on SOPHGO sg2042 & th1520 and passed stress
tests on Fedora, Ubuntu, OpenEuler, and others. Here is the performance
comparison between qspinlock and ticket_lock on sg2042 (64 cores):

sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 1444393 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

System Benchmarks (+6.4%)
  queued_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
    Double-Precision Whetstone                       55.0     182422.8  33167.8
    Execl Throughput                                 43.0      13116.6   3050.4
    File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
    File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
    Pipe Throughput                               12440.0   23058600.5  18535.9
    Pipe-based Context Switching                   4000.0    2835617.7   7089.0
    Process Creation                                126.0      12537.3    995.0
    Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
    Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
    System Call Overhead                          15000.0   33308301.3  22205.5
                                                                       ========
    System Benchmarks Index Score                                       12426.1

  ticket_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
    Double-Precision Whetstone                       55.0     181921.0  33076.5
    Execl Throughput                                 43.0      12625.1   2936.1
    File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
    File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
    Pipe Throughput                               12440.0   20594018.7  16554.7
    Pipe-based Context Switching                   4000.0    2571117.7   6427.8
    Process Creation                                126.0      10798.4    857.0
    Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
    Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
    System Call Overhead                          15000.0   30766778.4  20511.2
                                                                       ========
    System Benchmarks Index Score                                       11670.7

qspinlock shows a significant improvement over ticket_lock on the
64-core SOPHGO SG2042 platform.

Paravirt qspinlock
==================

This part is based on Andrew Jones' "Add skeleton for pv-time support"
patches, which unify the paravirt framework.

We implemented kvm_kick_cpu/kvm_wait_cpu and added tracepoints to
observe their behavior. We also introduce a new SBI extension,
SBI_EXT_PVLOCK (0xAB0401). If the name and number are approved, I will
send a formal proposal for the SBI spec.
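
As a rough sketch of the guest side (the extension ID is from this
cover letter; the function ID below is an illustrative placeholder, not
the series' actual definition):

	#define SBI_EXT_PVLOCK		0xAB0401
	#define SBI_EXT_PVLOCK_KICK_CPU	0	/* hypothetical fid */

	static void pv_kick(int cpu)
	{
		/* Ask the hypervisor to wake the vCPU parked in pv_wait() */
		sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
			  cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
	}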

Compact NUMA-aware (CNA) qspinlock
==================================

Based on Alex Kogan's "Add NUMA-awareness to qspinlock" patches.

Multiple sg2042 chips can compose 2- or 4-node NUMA systems, and these
patches are preparation for the next test. I hope "CNA vs. native" will
show a bigger improvement than "qspinlock vs. ticket_lock". We tested it
on sg2042 hardware with "numa_spinlock=on" to ensure the software is
bug-free before the multi-node hardware comes out.
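
For reference, CNA is selected at boot time via the command-line switch
from Alex Kogan's series:

	# append to the kernel command line on a NUMA machine
	numa_spinlock=on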

Changelog:
V10:
 - Use the alternatives framework instead of static_key_branch in
   asm/spinlock.h.
 - Fix the store merge buffer problem that caused the qspinlock lock
   torture test livelock.
 - Add paravirt qspinlock support, including a KVM backend
 - Add Compact NUMA-aware qspinlock support

V9:
https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
 - Clean up the generic ticket-lock code (use smp_mb__after_spinlock as
   RCsc)
 - Add qspinlock and combo-lock for riscv
 - Add qspinlock to openrisc
 - Use generic header in csky
 - Optimize cmpxchg & atomic code

V8:
https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
 - Ticket-lock coding convention fixups
 - Move combo spinlock into riscv and simplify asm-generic/spinlock.h
 - Fix the wrong return value of xchg16
 - Add csky qspinlock
 - Add combo & qspinlock & ticket-lock comparison
 - Clean up unnecessary riscv acquire and release definitions
 - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky

V7:
https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
 - Add combo spinlock (ticket & queued) support
 - Rename ticket_spinlock.h
 - Remove unnecessary atomic_read in ticket_spin_value_unlocked  

V6:
https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
 - Fix a Clang compile problem reported by the kernel test robot
 - Cleanup asm-generic/spinlock.h
 - Remove the changelog from the main patch comments, as suggested by
   Conor Dooley
 - Remove "default y if NUMA" in Kconfig

V5:
https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
 - Update comments with the RISC-V forward progress guarantee feature.
 - Go back to the V3 direction and optimize the asm code.

V4:
https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
 - Remove custom sub-word xchg implementation
 - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock

V3:
https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
 - Coding convention changes following Peter Zijlstra's advice

V2:
https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
 - Coding convention in cmpxchg.h
 - Re-implement short xchg
 - Remove char & cmpxchg implementations

V1:
https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
 - Use a cmpxchg loop to implement sub-word atomics

Guo Ren (19):
  asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  asm-generic: ticket-lock: Move into ticket_spinlock.h
  riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
  riscv: qspinlock: Add basic queued_spinlock support
  riscv: qspinlock: Introduce combo spinlock
  riscv: qspinlock: Allow force qspinlock from the command line
  riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  riscv: qspinlock: Use new static key for controlling call of
    virt_spin_lock()
  RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
  RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
  RISC-V: paravirt: pvqspinlock: KVM: Implement
    kvm_sbi_ext_pvlock_kick_cpu()
  RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
  RISC-V: paravirt: pvqspinlock: Remove unnecessary definitions of
    cmpxchg & xchg
  RISC-V: paravirt: pvqspinlock: Add xchg8 & cmpxchg_small support
  RISC-V: paravirt: pvqspinlock: Add SBI implementation
  RISC-V: paravirt: pvqspinlock: Add kconfig entry
  RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
  locking/qspinlock: Move pv_ops into x86 directory
  locking/qspinlock: riscv: Add Compact NUMA-aware lock support

 .../admin-guide/kernel-parameters.txt         |   5 +-
 arch/riscv/Kconfig                            |  54 ++++
 arch/riscv/Kconfig.errata                     |  32 ++
 arch/riscv/errata/thead/errata.c              |  44 +++
 arch/riscv/include/asm/Kbuild                 |   2 +-
 arch/riscv/include/asm/cmpxchg.h              | 286 ++++++++++++------
 arch/riscv/include/asm/cpufeature.h           |   2 +
 arch/riscv/include/asm/errata_list.h          |  33 +-
 arch/riscv/include/asm/hwcap.h                |   1 +
 arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
 arch/riscv/include/asm/paravirt.h             |  20 ++
 arch/riscv/include/asm/qspinlock.h            |  34 +++
 arch/riscv/include/asm/qspinlock_paravirt.h   |   7 +
 arch/riscv/include/asm/rwonce.h               |  24 ++
 arch/riscv/include/asm/sbi.h                  |  14 +
 arch/riscv/include/asm/spinlock.h             | 122 ++++++++
 arch/riscv/include/asm/vendorid_list.h        |  15 +
 arch/riscv/include/uapi/asm/kvm.h             |   1 +
 arch/riscv/kernel/cpufeature.c                |  26 ++
 arch/riscv/kernel/paravirt.c                  |  86 ++++++
 arch/riscv/kernel/sbi.c                       |   2 +-
 arch/riscv/kernel/setup.c                     |  22 ++
 .../kernel/trace_events_filter_paravirt.h     |  60 ++++
 arch/riscv/kvm/Makefile                       |   1 +
 arch/riscv/kvm/vcpu_sbi.c                     |   4 +
 arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 ++++
 arch/x86/include/asm/qspinlock.h              |   3 +-
 arch/x86/kernel/alternative.c                 |   6 +-
 include/asm-generic/rwonce.h                  |   2 +
 include/asm-generic/spinlock.h                |  87 +-----
 include/asm-generic/spinlock_types.h          |  12 +-
 include/asm-generic/ticket_spinlock.h         | 103 +++++++
 kernel/locking/qspinlock_cna.h                |  14 +-
 33 files changed, 971 insertions(+), 211 deletions(-)
 create mode 100644 arch/riscv/include/asm/qspinlock.h
 create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/riscv/include/asm/rwonce.h
 create mode 100644 arch/riscv/include/asm/spinlock.h
 create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
 create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
 create mode 100644 include/asm-generic/ticket_spinlock.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH V10 01/19] asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The arch_spinlock_t of qspinlock contains the atomic_t val, which
satisfies the ticket-lock requirement. Thus, unify arch_spinlock_t into
qspinlock_types.h. This is preparation for the combo spinlock that
follows in this series.
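
For reference, the qspinlock lock word being reused looks roughly like
the little-endian layout in include/asm-generic/qspinlock_types.h
(trimmed sketch, not part of this patch):

	typedef struct qspinlock {
		union {
			atomic_t val;		/* whole 32-bit lock word */
			struct {
				u8 locked;	/* lock-held byte */
				u8 pending;
			};
			struct {
				u16 locked_pending;
				u16 tail;	/* MCS queue tail */
			};
		};
	} arch_spinlock_t;

The ticket lock only needs the atomic_t val member (owner/next live in
the two 16-bit halves), so the same type can serve both implementations.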

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
---
 include/asm-generic/spinlock.h       | 14 +++++++-------
 include/asm-generic/spinlock_types.h | 12 ++----------
 2 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 90803a826ba0..4773334ee638 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -32,7 +32,7 @@
 
 static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 {
-	u32 val = atomic_fetch_add(1<<16, lock);
+	u32 val = atomic_fetch_add(1<<16, &lock->val);
 	u16 ticket = val >> 16;
 
 	if (ticket == (u16)val)
@@ -46,31 +46,31 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
 	 * have no outstanding writes due to the atomic_fetch_add() the extra
 	 * orderings are free.
 	 */
-	atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
 	smp_mb();
 }
 
 static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
 {
-	u32 old = atomic_read(lock);
+	u32 old = atomic_read(&lock->val);
 
 	if ((old >> 16) != (old & 0xffff))
 		return false;
 
-	return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
 }
 
 static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
 	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
-	u32 val = atomic_read(lock);
+	u32 val = atomic_read(&lock->val);
 
 	smp_store_release(ptr, (u16)val + 1);
 }
 
 static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
-	u32 val = lock.counter;
+	u32 val = lock.val.counter;
 
 	return ((val >> 16) == (val & 0xffff));
 }
@@ -84,7 +84,7 @@ static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
 
 static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
 {
-	u32 val = atomic_read(lock);
+	u32 val = atomic_read(&lock->val);
 
 	return (s16)((val >> 16) - (val & 0xffff)) > 1;
 }
diff --git a/include/asm-generic/spinlock_types.h b/include/asm-generic/spinlock_types.h
index 8962bb730945..f534aa5de394 100644
--- a/include/asm-generic/spinlock_types.h
+++ b/include/asm-generic/spinlock_types.h
@@ -3,15 +3,7 @@
 #ifndef __ASM_GENERIC_SPINLOCK_TYPES_H
 #define __ASM_GENERIC_SPINLOCK_TYPES_H
 
-#include <linux/types.h>
-typedef atomic_t arch_spinlock_t;
-
-/*
- * qrwlock_types depends on arch_spinlock_t, so we must typedef that before the
- * include.
- */
-#include <asm/qrwlock_types.h>
-
-#define __ARCH_SPIN_LOCK_UNLOCKED	ATOMIC_INIT(0)
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>
 
 #endif /* __ASM_GENERIC_SPINLOCK_TYPES_H */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 02/19] asm-generic: ticket-lock: Move into ticket_spinlock.h
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Move the ticket-lock definitions into an independent file. This is
preparation for the riscv combo spinlock that follows in this series.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 include/asm-generic/spinlock.h        |  87 +---------------------
 include/asm-generic/ticket_spinlock.h | 103 ++++++++++++++++++++++++++
 2 files changed, 104 insertions(+), 86 deletions(-)
 create mode 100644 include/asm-generic/ticket_spinlock.h

diff --git a/include/asm-generic/spinlock.h b/include/asm-generic/spinlock.h
index 4773334ee638..970590baf61b 100644
--- a/include/asm-generic/spinlock.h
+++ b/include/asm-generic/spinlock.h
@@ -1,94 +1,9 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
-/*
- * 'Generic' ticket-lock implementation.
- *
- * It relies on atomic_fetch_add() having well defined forward progress
- * guarantees under contention. If your architecture cannot provide this, stick
- * to a test-and-set lock.
- *
- * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
- * sub-word of the value. This is generally true for anything LL/SC although
- * you'd be hard pressed to find anything useful in architecture specifications
- * about this. If your architecture cannot do this you might be better off with
- * a test-and-set.
- *
- * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
- * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
- * a full fence after the spin to upgrade the otherwise-RCpc
- * atomic_cond_read_acquire().
- *
- * The implementation uses smp_cond_load_acquire() to spin, so if the
- * architecture has WFE like instructions to sleep instead of poll for word
- * modifications be sure to implement that (see ARM64 for example).
- *
- */
-
 #ifndef __ASM_GENERIC_SPINLOCK_H
 #define __ASM_GENERIC_SPINLOCK_H
 
-#include <linux/atomic.h>
-#include <asm-generic/spinlock_types.h>
-
-static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
-{
-	u32 val = atomic_fetch_add(1<<16, &lock->val);
-	u16 ticket = val >> 16;
-
-	if (ticket == (u16)val)
-		return;
-
-	/*
-	 * atomic_cond_read_acquire() is RCpc, but rather than defining a
-	 * custom cond_read_rcsc() here we just emit a full fence.  We only
-	 * need the prior reads before subsequent writes ordering from
-	 * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
-	 * have no outstanding writes due to the atomic_fetch_add() the extra
-	 * orderings are free.
-	 */
-	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
-	smp_mb();
-}
-
-static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
-{
-	u32 old = atomic_read(&lock->val);
-
-	if ((old >> 16) != (old & 0xffff))
-		return false;
-
-	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
-}
-
-static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
-	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
-	u32 val = atomic_read(&lock->val);
-
-	smp_store_release(ptr, (u16)val + 1);
-}
-
-static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
-{
-	u32 val = lock.val.counter;
-
-	return ((val >> 16) == (val & 0xffff));
-}
-
-static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
-{
-	arch_spinlock_t val = READ_ONCE(*lock);
-
-	return !arch_spin_value_unlocked(val);
-}
-
-static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
-{
-	u32 val = atomic_read(&lock->val);
-
-	return (s16)((val >> 16) - (val & 0xffff)) > 1;
-}
-
+#include <asm-generic/ticket_spinlock.h>
 #include <asm/qrwlock.h>
 
 #endif /* __ASM_GENERIC_SPINLOCK_H */
diff --git a/include/asm-generic/ticket_spinlock.h b/include/asm-generic/ticket_spinlock.h
new file mode 100644
index 000000000000..cfcff22b37b3
--- /dev/null
+++ b/include/asm-generic/ticket_spinlock.h
@@ -0,0 +1,103 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well defined forward progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC although
+ * you'd be hard pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this you might be better off with
+ * a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add() which is RCsc to create an RCsc hot path, along with
+ * a full fence after the spin to upgrade the otherwise-RCpc
+ * atomic_cond_read_acquire().
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE like instructions to sleep instead of poll for word
+ * modifications be sure to implement that (see ARM64 for example).
+ *
+ */
+
+#ifndef __ASM_GENERIC_TICKET_SPINLOCK_H
+#define __ASM_GENERIC_TICKET_SPINLOCK_H
+
+#include <linux/atomic.h>
+#include <asm-generic/spinlock_types.h>
+
+static __always_inline void ticket_spin_lock(arch_spinlock_t *lock)
+{
+	u32 val = atomic_fetch_add(1<<16, &lock->val);
+	u16 ticket = val >> 16;
+
+	if (ticket == (u16)val)
+		return;
+
+	/*
+	 * atomic_cond_read_acquire() is RCpc, but rather than defining a
+	 * custom cond_read_rcsc() here we just emit a full fence.  We only
+	 * need the prior reads before subsequent writes ordering from
+	 * smb_mb(), but as atomic_cond_read_acquire() just emits reads and we
+	 * have no outstanding writes due to the atomic_fetch_add() the extra
+	 * orderings are free.
+	 */
+	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
+	smp_mb();
+}
+
+static __always_inline bool ticket_spin_trylock(arch_spinlock_t *lock)
+{
+	u32 old = atomic_read(&lock->val);
+
+	if ((old >> 16) != (old & 0xffff))
+		return false;
+
+	return atomic_try_cmpxchg(&lock->val, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void ticket_spin_unlock(arch_spinlock_t *lock)
+{
+	u16 *ptr = (u16 *)lock + IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+	u32 val = atomic_read(&lock->val);
+
+	smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int ticket_spin_value_unlocked(arch_spinlock_t lock)
+{
+	u32 val = lock.val.counter;
+
+	return ((val >> 16) == (val & 0xffff));
+}
+
+static __always_inline int ticket_spin_is_locked(arch_spinlock_t *lock)
+{
+	arch_spinlock_t val = READ_ONCE(*lock);
+
+	return !ticket_spin_value_unlocked(val);
+}
+
+static __always_inline int ticket_spin_is_contended(arch_spinlock_t *lock)
+{
+	u32 val = atomic_read(&lock->val);
+
+	return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+/*
+ * Remapping spinlock architecture specific functions to the corresponding
+ * ticket spinlock functions.
+ */
+#define arch_spin_is_locked(l)		ticket_spin_is_locked(l)
+#define arch_spin_is_contended(l)	ticket_spin_is_contended(l)
+#define arch_spin_value_unlocked(l)	ticket_spin_value_unlocked(l)
+#define arch_spin_lock(l)		ticket_spin_lock(l)
+#define arch_spin_trylock(l)		ticket_spin_trylock(l)
+#define arch_spin_unlock(l)		ticket_spin_unlock(l)
+
+#endif /* __ASM_GENERIC_TICKET_SPINLOCK_H */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 03/19] riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Early versions of the T-Head C9xx cores have a store merge buffer delay
problem. The store merge buffer improves store queue performance by
merging multiple store requests, but when no further store requests
follow, the prior single store request can wait in the store queue for
a long time. That causes significant problems for communication between
cores. This problem was found on the sg2042 & th1520 platforms with the
qspinlock lock torture test.

Appending a "fence w, o" immediately flushes the store merge buffer and
lets the other cores see the write result.

This applies the WRITE_ONCE errata to handle the non-standard behavior
by appending a "fence w, o" instruction to WRITE_ONCE().
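
For context, one store this protects is the MCS-node publish in the
generic qspinlock slowpath (kernel/locking/qspinlock.c); an illustrative
sketch of the failure mode, not code from this patch:

	/*
	 * CPU0 (new waiter)              CPU1 (current queue head)
	 * WRITE_ONCE(prev->next, node);  smp_cond_load_relaxed(&node->next,
	 *                                                      (VAL));
	 *
	 * If CPU0's lone store lingers in the store merge buffer, CPU1
	 * keeps reading NULL and the queue degenerates into the "dead
	 * ring" described in the cover letter. A "fence w, o" after the
	 * store pushes it out so CPU1 can observe its successor.
	 */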

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig.errata              | 19 +++++++++++++++++++
 arch/riscv/errata/thead/errata.c       | 20 ++++++++++++++++++++
 arch/riscv/include/asm/errata_list.h   | 13 -------------
 arch/riscv/include/asm/rwonce.h        | 24 ++++++++++++++++++++++++
 arch/riscv/include/asm/vendorid_list.h | 14 ++++++++++++++
 include/asm-generic/rwonce.h           |  2 ++
 6 files changed, 79 insertions(+), 13 deletions(-)
 create mode 100644 arch/riscv/include/asm/rwonce.h

diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
index 0c8f4652cd82..4745a5c57e7c 100644
--- a/arch/riscv/Kconfig.errata
+++ b/arch/riscv/Kconfig.errata
@@ -77,4 +77,23 @@ config ERRATA_THEAD_PMU
 
 	  If you don't know what to do here, say "Y".
 
+config ERRATA_THEAD_WRITE_ONCE
+	bool "Apply T-Head WRITE_ONCE errata"
+	depends on ERRATA_THEAD
+	default y
+	help
+	  The early version of T-Head C9xx cores has a store merge buffer
+	  delay problem. The store merge buffer could improve the store queue
+	  performance by merging multi-store requests, but when there are no
+	  continued store requests, the prior single store request would be
+	  waiting in the store queue for a long time. That would cause
+	  significant problems for communication between multi-cores. Appending
+	  a fence w.o could immediately flush the store merge buffer and let
+	  other cores see the write result.
+
+	  This will apply the WRITE_ONCE errata to handle the non-standard
+	  behavior via appending a fence w.o instruction for WRITE_ONCE().
+
+	  If you don't know what to do here, say "Y".
+
 endmenu # "CPU errata selection"
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index be84b14f0118..881729746d2e 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -69,6 +69,23 @@ static bool errata_probe_pmu(unsigned int stage,
 	return true;
 }
 
+static bool errata_probe_write_once(unsigned int stage,
+				    unsigned long arch_id, unsigned long impid)
+{
+	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
+		return false;
+
+	/* target-c9xx cores report arch_id and impid as 0 */
+	if (arch_id != 0 || impid != 0)
+		return false;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT ||
+	    stage == RISCV_ALTERNATIVES_MODULE)
+		return true;
+
+	return false;
+}
+
 static u32 thead_errata_probe(unsigned int stage,
 			      unsigned long archid, unsigned long impid)
 {
@@ -83,6 +100,9 @@ static u32 thead_errata_probe(unsigned int stage,
 	if (errata_probe_pmu(stage, archid, impid))
 		cpu_req_errata |= BIT(ERRATA_THEAD_PMU);
 
+	if (errata_probe_write_once(stage, archid, impid))
+		cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
+
 	return cpu_req_errata;
 }
 
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 712cab7adffe..fbb2b8d39321 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -11,19 +11,6 @@
 #include <asm/hwcap.h>
 #include <asm/vendorid_list.h>
 
-#ifdef CONFIG_ERRATA_SIFIVE
-#define	ERRATA_SIFIVE_CIP_453 0
-#define	ERRATA_SIFIVE_CIP_1200 1
-#define	ERRATA_SIFIVE_NUMBER 2
-#endif
-
-#ifdef CONFIG_ERRATA_THEAD
-#define	ERRATA_THEAD_PBMT 0
-#define	ERRATA_THEAD_CMO 1
-#define	ERRATA_THEAD_PMU 2
-#define	ERRATA_THEAD_NUMBER 3
-#endif
-
 #ifdef __ASSEMBLY__
 
 #define ALT_INSN_FAULT(x)						\
diff --git a/arch/riscv/include/asm/rwonce.h b/arch/riscv/include/asm/rwonce.h
new file mode 100644
index 000000000000..be0b8864969d
--- /dev/null
+++ b/arch/riscv/include/asm/rwonce.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_RWONCE_H
+#define __ASM_RWONCE_H
+
+#include <linux/compiler_types.h>
+#include <asm/alternative-macros.h>
+#include <asm/vendorid_list.h>
+
+#define __WRITE_ONCE(x, val)				\
+do {							\
+	*(volatile typeof(x) *)&(x) = (val);		\
+	asm volatile(ALTERNATIVE(			\
+		__nops(1),				\
+		"fence w, o\n\t",			\
+		THEAD_VENDOR_ID,			\
+		ERRATA_THEAD_WRITE_ONCE,		\
+		CONFIG_ERRATA_THEAD_WRITE_ONCE)		\
+		: : : "memory");			\
+} while (0)
+
+#include <asm-generic/rwonce.h>
+
+#endif	/* __ASM_RWONCE_H */
diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index cb89af3f0704..73078cfe4029 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -8,4 +8,18 @@
 #define SIFIVE_VENDOR_ID	0x489
 #define THEAD_VENDOR_ID		0x5b7
 
+#ifdef CONFIG_ERRATA_SIFIVE
+#define	ERRATA_SIFIVE_CIP_453 0
+#define	ERRATA_SIFIVE_CIP_1200 1
+#define	ERRATA_SIFIVE_NUMBER 2
+#endif
+
+#ifdef CONFIG_ERRATA_THEAD
+#define	ERRATA_THEAD_PBMT 0
+#define	ERRATA_THEAD_CMO 1
+#define	ERRATA_THEAD_PMU 2
+#define	ERRATA_THEAD_WRITE_ONCE 3
+#define	ERRATA_THEAD_NUMBER 4
+#endif
+
 #endif
diff --git a/include/asm-generic/rwonce.h b/include/asm-generic/rwonce.h
index 8d0a6280e982..fb07fe8c6e45 100644
--- a/include/asm-generic/rwonce.h
+++ b/include/asm-generic/rwonce.h
@@ -50,10 +50,12 @@
 	__READ_ONCE(x);							\
 })
 
+#ifndef __WRITE_ONCE
 #define __WRITE_ONCE(x, val)						\
 do {									\
 	*(volatile typeof(x) *)&(x) = (val);				\
 } while (0)
+#endif
 
 #define WRITE_ONCE(x, val)						\
 do {									\
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 04/19] riscv: qspinlock: Add basic queued_spinlock support
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The requirements of qspinlock have been documented by commit
a8ad07e5240c ("asm-generic: qspinlock: Indicate the use of mixed-size
atomics").

Although the RISC-V ISA mandates only a weak forward-progress
guarantee for LR/SC, which does not satisfy the qspinlock requirements
above, that does not prevent RISC-V vendors from implementing a strong
forward-progress guarantee for LR/SC in their microarchitecture to
match the xchg_tail requirement. The T-HEAD C9xx processors are one
example.
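
For readers unfamiliar with the mixed-size requirement, here is a
hedged userspace sketch (GCC builtins, little-endian layout,
illustrative names -- not the kernel code) of why xchg_tail() is a
16-bit exchange racing against byte-sized stores:

	#include <stdint.h>

	/* Mirrors the generic qspinlock word layout on little-endian:
	 * one 32-bit lock word, where unlock is a plain byte store to
	 * ->locked while xchg_tail() atomically swaps only ->tail. */
	union qspinlock_sketch {
		uint32_t val;
		struct {
			uint8_t  locked;	/* byte store on unlock */
			uint8_t  pending;
			uint16_t tail;		/* 16-bit xchg in xchg_tail() */
		};
	};

	/* xchg_tail() boils down to this; an LR/SC loop emulating the
	 * 16-bit exchange must make forward progress against the
	 * concurrent byte stores above -- the guarantee this series
	 * depends on. */
	static inline uint16_t xchg_tail_sketch(union qspinlock_sketch *l,
						uint16_t new_tail)
	{
		return __atomic_exchange_n(&l->tail, new_tail,
					   __ATOMIC_RELAXED);
	}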

We've tested the patch on SOPHGO sg2042 & th1520, and it passed stress
tests on Fedora, Ubuntu, OpenEuler, and others. Here is the
performance comparison between qspinlock and ticket_lock on sg2042
(64 cores):

sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 1444393 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

System Benchmarks (+6.4%)
  queued_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
    Double-Precision Whetstone                       55.0     182422.8  33167.8
    Execl Throughput                                 43.0      13116.6   3050.4
    File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
    File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
    Pipe Throughput                               12440.0   23058600.5  18535.9
    Pipe-based Context Switching                   4000.0    2835617.7   7089.0
    Process Creation                                126.0      12537.3    995.0
    Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
    Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
    System Call Overhead                          15000.0   33308301.3  22205.5
                                                                       ========
    System Benchmarks Index Score                                       12426.1

  ticket_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
    Double-Precision Whetstone                       55.0     181921.0  33076.5
    Execl Throughput                                 43.0      12625.1   2936.1
    File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
    File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
    Pipe Throughput                               12440.0   20594018.7  16554.7
    Pipe-based Context Switching                   4000.0    2571117.7   6427.8
    Process Creation                                126.0      10798.4    857.0
    Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
    Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
    System Call Overhead                          15000.0   30766778.4  20511.2
                                                                       ========
    System Benchmarks Index Score                                       11670.7

The qspinlock shows a significant improvement over the ticket_lock on
the 64-core SOPHGO SG2042 platform.

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
---
 arch/riscv/Kconfig                | 16 ++++++++++++++++
 arch/riscv/include/asm/Kbuild     |  3 ++-
 arch/riscv/include/asm/cmpxchg.h  | 24 ++++++++++++++++++++++++
 arch/riscv/include/asm/spinlock.h | 17 +++++++++++++++++
 4 files changed, 59 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/include/asm/spinlock.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 93ff677d2be5..e89a3bea3dc1 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -438,6 +438,22 @@ config NODES_SHIFT
 	  Specify the maximum number of NUMA Nodes available on the target
 	  system.  Increases memory reserved to accommodate various tables.
 
+choice
+	prompt "RISC-V spinlock type"
+	default RISCV_TICKET_SPINLOCKS
+
+config RISCV_TICKET_SPINLOCKS
+	bool "Using ticket spinlock"
+
+config RISCV_QUEUED_SPINLOCKS
+	bool "Using queued spinlock"
+	depends on SMP && MMU
+	select ARCH_USE_QUEUED_SPINLOCKS
+	help
+	  Make sure your micro-arch LR/SC has a strong forward progress guarantee.
+	  Otherwise, stay with ticket-lock.
+endchoice
+
 config RISCV_ALTERNATIVE
 	bool
 	depends on !XIP_KERNEL
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 504f8b7e72d4..a0dc85e4a754 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -2,10 +2,11 @@
 generic-y += early_ioremap.h
 generic-y += flat.h
 generic-y += kvm_para.h
+generic-y += mcs_spinlock.h
 generic-y += parport.h
-generic-y += spinlock.h
 generic-y += spinlock_types.h
 generic-y += qrwlock.h
 generic-y += qrwlock_types.h
+generic-y += qspinlock.h
 generic-y += user.h
 generic-y += vmlinux.lds.h
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 2f4726d3cfcc..d12231d752a4 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -11,12 +11,36 @@
 #include <asm/barrier.h>
 #include <asm/fence.h>
 
+static inline ulong __xchg16_relaxed(ulong new, void *ptr)
+{
+	ulong ret, tmp;
+	ulong shif = ((ulong)ptr & 2) ? 16 : 0;
+	ulong mask = 0xffff << shif;
+	ulong *__ptr = (ulong *)((ulong)ptr & ~2);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	or   %1, %1, %z4\n"
+		"	sc.w %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		: "=&r" (ret), "=&r" (tmp), "+A" (*__ptr)
+		: "rJ" (~mask), "rJ" (new << shif)
+		: "memory");
+
+	return (ulong)((ret & mask) >> shif);
+}
+
 #define __xchg_relaxed(ptr, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__xchg16_relaxed((ulong)__new, __ptr);		\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w %0, %2, %1\n"			\
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
new file mode 100644
index 000000000000..c644a92d4548
--- /dev/null
+++ b/arch/riscv/include/asm/spinlock.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_RISCV_SPINLOCK_H
+#define __ASM_RISCV_SPINLOCK_H
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#define _Q_PENDING_LOOPS	(1 << 9)
+#endif
+
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm/qspinlock.h>
+#include <asm/qrwlock.h>
+#else
+#include <asm-generic/spinlock.h>
+#endif
+
+#endif /* __ASM_RISCV_SPINLOCK_H */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 05/19] riscv: qspinlock: Introduce combo spinlock
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The combo spinlock supports both queued and ticket locks in one Linux
image and selects between them at boot time via the errata mechanism.
Here is the function size (bytes) comparison table:

TYPE			: COMBO | TICKET | QUEUED
arch_spin_lock		: 106	| 60     | 50
arch_spin_unlock	: 54    | 36     | 26
arch_spin_trylock	: 110   | 72     | 54
arch_spin_is_locked	: 48    | 34     | 20
arch_spin_is_contended	: 56    | 40     | 24
arch_spin_value_unlocked	: 48    | 34     | 24

One example of the disassembled combo arch_spin_unlock:
   0xffffffff8000409c <+14>:    nop                # detour slot
   0xffffffff800040a0 <+18>:    fence   rw,w       # queued spinlock start
   0xffffffff800040a4 <+22>:    sb      zero,0(a4) # queued spinlock end
   0xffffffff800040a8 <+26>:    ld      s0,8(sp)
   0xffffffff800040aa <+28>:    addi    sp,sp,16
   0xffffffff800040ac <+30>:    ret
   0xffffffff800040ae <+32>:    lw      a5,0(a4)   # ticket spinlock start
   0xffffffff800040b0 <+34>:    sext.w  a5,a5
   0xffffffff800040b2 <+36>:    fence   rw,w
   0xffffffff800040b6 <+40>:    addiw   a5,a5,1
   0xffffffff800040b8 <+42>:    slli    a5,a5,0x30
   0xffffffff800040ba <+44>:    srli    a5,a5,0x30
   0xffffffff800040bc <+46>:    sh      a5,0(a4)   # ticket spinlock end
   0xffffffff800040c0 <+50>:    ld      s0,8(sp)
   0xffffffff800040c2 <+52>:    addi    sp,sp,16
   0xffffffff800040c4 <+54>:    ret

The qspinlock is smaller and faster than the ticket-lock when both
stay on the fast path, and the combo spinlock provides one compatible
Linux image for processors with different micro-arch designs
(weak/strong forward-progress guarantees).
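
Conceptually, each combo operation is a branch patched once at boot:
the ALTERNATIVE() detour slot either falls through to the queued path
(nop) or jumps to the ticket path. A hedged C sketch of the same
dispatch, using an ordinary boolean instead of instruction patching
(the declarations are illustrative, not the kernel's):

	typedef struct arch_spinlock arch_spinlock_t;

	void queued_spin_lock(arch_spinlock_t *lock);	/* illustrative */
	void ticket_spin_lock(arch_spinlock_t *lock);	/* illustrative */

	static _Bool use_ticket_lock;	/* decided once during early boot */

	static inline void combo_spin_lock(arch_spinlock_t *lock)
	{
		/* The real code patches a nop/jump via ALTERNATIVE(),
		 * so no per-acquisition test remains at runtime. */
		if (use_ticket_lock)
			ticket_spin_lock(lock);
		else
			queued_spin_lock(lock);
	}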

Signed-off-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
---
 arch/riscv/Kconfig                |  9 +++-
 arch/riscv/include/asm/hwcap.h    |  1 +
 arch/riscv/include/asm/spinlock.h | 87 ++++++++++++++++++++++++++++++-
 arch/riscv/kernel/cpufeature.c    | 10 ++++
 4 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e89a3bea3dc1..119e774a3dcf 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -440,7 +440,7 @@ config NODES_SHIFT
 
 choice
 	prompt "RISC-V spinlock type"
-	default RISCV_TICKET_SPINLOCKS
+	default RISCV_COMBO_SPINLOCKS
 
 config RISCV_TICKET_SPINLOCKS
 	bool "Using ticket spinlock"
@@ -452,6 +452,13 @@ config RISCV_QUEUED_SPINLOCKS
 	help
 	  Make sure your micro-arch LR/SC has a strong forward progress guarantee.
 	  Otherwise, stay with ticket-lock.
+
+config RISCV_COMBO_SPINLOCKS
+	bool "Using combo spinlock"
+	depends on SMP && MMU
+	select ARCH_USE_QUEUED_SPINLOCKS
+	help
+	  Select queued spinlock or ticket-lock via the errata mechanism.
 endchoice
 
 config RISCV_ALTERNATIVE
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index f041bfa7f6a0..08ae75a694c2 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -54,6 +54,7 @@
 #define RISCV_ISA_EXT_ZIFENCEI		41
 #define RISCV_ISA_EXT_ZIHPM		42
 
+#define RISCV_ISA_EXT_XTICKETLOCK	63
 #define RISCV_ISA_EXT_MAX		64
 #define RISCV_ISA_EXT_NAME_LEN_MAX	32
 
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index c644a92d4548..9eb3ad31e564 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -7,11 +7,94 @@
 #define _Q_PENDING_LOOPS	(1 << 9)
 #endif
 
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+#include <asm-generic/ticket_spinlock.h>
+
+#undef arch_spin_is_locked
+#undef arch_spin_is_contended
+#undef arch_spin_value_unlocked
+#undef arch_spin_lock
+#undef arch_spin_trylock
+#undef arch_spin_unlock
+
+#include <asm-generic/qspinlock.h>
+#include <asm/hwcap.h>
+
+#undef arch_spin_is_locked
+#undef arch_spin_is_contended
+#undef arch_spin_value_unlocked
+#undef arch_spin_lock
+#undef arch_spin_trylock
+#undef arch_spin_unlock
+
+#define COMBO_DETOUR				\
+	asm_volatile_goto(ALTERNATIVE(		\
+		"nop",				\
+		"j %l[ticket_spin_lock]",	\
+		0,				\
+		RISCV_ISA_EXT_XTICKETLOCK,	\
+		CONFIG_RISCV_COMBO_SPINLOCKS)	\
+		: : : : ticket_spin_lock);
+
+static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+	COMBO_DETOUR
+	queued_spin_lock(lock);
+	return;
+ticket_spin_lock:
+	ticket_spin_lock(lock);
+}
+
+static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
+{
+	COMBO_DETOUR
+	return queued_spin_trylock(lock);
+ticket_spin_lock:
+	return ticket_spin_trylock(lock);
+}
+
+static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+	COMBO_DETOUR
+	queued_spin_unlock(lock);
+	return;
+ticket_spin_lock:
+	ticket_spin_unlock(lock);
+}
+
+static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
+{
+	COMBO_DETOUR
+	return queued_spin_value_unlocked(lock);
+ticket_spin_lock:
+	return ticket_spin_value_unlocked(lock);
+}
+
+static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
+{
+	COMBO_DETOUR
+	return queued_spin_is_locked(lock);
+ticket_spin_lock:
+	return ticket_spin_is_locked(lock);
+}
+
+static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
+{
+	COMBO_DETOUR
+	return queued_spin_is_contended(lock);
+ticket_spin_lock:
+	return ticket_spin_is_contended(lock);
+}
+#else /* CONFIG_RISCV_COMBO_SPINLOCKS */
+
 #ifdef CONFIG_QUEUED_SPINLOCKS
 #include <asm/qspinlock.h>
-#include <asm/qrwlock.h>
 #else
-#include <asm-generic/spinlock.h>
+#include <asm-generic/ticket_spinlock.h>
 #endif
 
+#endif /* CONFIG_RISCV_COMBO_SPINLOCKS */
+
+#include <asm/qrwlock.h>
+
 #endif /* __ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index bdcf460ea53d..e65b0e54152d 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -324,6 +324,16 @@ void __init riscv_fill_hwcap(void)
 		set_bit(RISCV_ISA_EXT_ZICSR, isainfo->isa);
 		set_bit(RISCV_ISA_EXT_ZIFENCEI, isainfo->isa);
 
+#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
+		/*
+		 * RISC-V Linux starts out with the queued spinlock; we then keep
+		 * it or switch to ticket_lock as the default. Because ticket_lock dirties the
+		 * spinlock value, the only way is to change from queued_spinlock to
+		 * ticket_spinlock, but not vice versa.
+		 */
+		set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
+#endif
+
 		/*
 		 * These ones were as they were part of the base ISA when the
 		 * port & dt-bindings were upstreamed, and so can be set
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 06/19] riscv: qspinlock: Allow force qspinlock from the command line
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Allow the command line to force the kernel to use queued_spinlock when
CONFIG_RISCV_COMBO_SPINLOCKS=y.
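
For example, an illustrative U-Boot fragment appending the bare flag
to the kernel command line (device paths and other arguments are
placeholders):

	=> setenv bootargs "root=/dev/mmcblk0p2 rw earlycon qspinlock"
	=> boot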

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 arch/riscv/include/asm/cpufeature.h             |  2 ++
 arch/riscv/kernel/cpufeature.c                  | 15 ++++++++++++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index be40bfbf4380..de6b7ee752cd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4666,6 +4666,9 @@
 
 	quiet		[KNL] Disable most log messages
 
+	qspinlock	[RISCV] Forces the kernel to use queued_spinlock when
+			CONFIG_RISCV_COMBO_SPINLOCKS=y.
+
 	r128=		[HW,DRM]
 
 	radix_hcall_invalidate=on  [PPC/PSERIES]
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index 23fed53b8815..2bf0343661da 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -30,4 +30,6 @@ DECLARE_PER_CPU(long, misaligned_access_speed);
 /* Per-cpu ISA extensions. */
 extern struct riscv_isainfo hart_isa[NR_CPUS];
 
+extern bool force_qspinlock;
+
 #endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index e65b0e54152d..f8dbbe1bbd34 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -99,6 +99,17 @@ static bool riscv_isa_extension_check(int id)
 	return true;
 }
 
+#ifdef CONFIG_QUEUED_SPINLOCKS
+bool force_qspinlock = false;
+static int __init force_queued_spinlock(char *p)
+{
+	force_qspinlock = true;
+	pr_info("Force kernel to use queued_spinlock\n");
+	return 0;
+}
+early_param("qspinlock", force_queued_spinlock);
+#endif
+
 void __init riscv_fill_hwcap(void)
 {
 	struct device_node *node;
@@ -331,7 +342,9 @@ void __init riscv_fill_hwcap(void)
 		 * spinlock value, the only way is to change from queued_spinlock to
 		 * ticket_spinlock, but not vice versa.
 		 */
-		set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
+		if (!force_qspinlock) {
+			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
+		}
 #endif
 
 		/*
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

According to the qspinlock requirements, RISC-V mandates only a weak
LR/SC forward-progress guarantee, which does not satisfy qspinlock.
But many vendors can implement a stronger forward-progress guarantee
for LR/SC to ensure that xchg_tail finishes in time on any kind of
hart. T-HEAD is a vendor that implements such strong
forward-progress-guarantee LR/SC instruction pairs, so enable
qspinlock for T-HEAD with the errata mechanism's help.
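
To make the forward-progress point concrete, an exchange on an LR/SC
machine is a retry loop like the hedged sketch below (a plain 32-bit
exchange, not the kernel's exact code). With only the weak ISA
guarantee, the sc.w may fail indefinitely under contention; the strong
guarantee bounds the retries:

	static inline unsigned int xchg32_lrsc_sketch(unsigned int *p,
						      unsigned int newval)
	{
		unsigned int old, fail;

		__asm__ __volatile__ (
			"0:	lr.w %0, %2\n"		/* load-reserve old value */
			"	sc.w %1, %3, %2\n"	/* try to store newval */
			"	bnez %1, 0b\n"		/* lost reservation: retry */
			: "=&r" (old), "=&r" (fail), "+A" (*p)
			: "r" (newval)
			: "memory");

		return old;
	}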

Early versions of T-HEAD processors have the merge buffer delay
problem, so we need ERRATA_THEAD_WRITE_ONCE to support qspinlock.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig.errata              | 13 +++++++++++++
 arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
 arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
 arch/riscv/include/asm/vendorid_list.h |  3 ++-
 arch/riscv/kernel/cpufeature.c         |  3 ++-
 5 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
index 4745a5c57e7c..eb43677b13cc 100644
--- a/arch/riscv/Kconfig.errata
+++ b/arch/riscv/Kconfig.errata
@@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
 
 	  If you don't know what to do here, say "Y".
 
+config ERRATA_THEAD_QSPINLOCK
+	bool "Apply T-Head queued spinlock errata"
+	depends on ERRATA_THEAD
+	default y
+	help
+	  The T-HEAD C9xx processors implement a strong forward-progress
+	  guarantee for LR/SC, matching the xchg_tail requirement of
+	  qspinlock.
+
+	  This will apply the QSPINLOCK errata to use qspinlock instead
+	  of ticket_lock on these processors.
+
+	  If you don't know what to do here, say "Y".
+
 endmenu # "CPU errata selection"
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index 881729746d2e..d560dc45c0e7 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -86,6 +86,27 @@ static bool errata_probe_write_once(unsigned int stage,
 	return false;
 }
 
+static bool errata_probe_qspinlock(unsigned int stage,
+				   unsigned long arch_id, unsigned long impid)
+{
+	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_QSPINLOCK))
+		return false;
+
+	/*
+	 * The queued_spinlock torture test would livelock without the
+	 * ERRATA_THEAD_WRITE_ONCE fixup on early versions of T-HEAD
+	 * processors.
+	 */
+	if (arch_id == 0 && impid == 0 &&
+	    !IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
+		return false;
+
+	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+		return true;
+
+	return false;
+}
+
 static u32 thead_errata_probe(unsigned int stage,
 			      unsigned long archid, unsigned long impid)
 {
@@ -103,6 +124,9 @@ static u32 thead_errata_probe(unsigned int stage,
 	if (errata_probe_write_once(stage, archid, impid))
 		cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
 
+	if (errata_probe_qspinlock(stage, archid, impid))
+		cpu_req_errata |= BIT(ERRATA_THEAD_QSPINLOCK);
+
 	return cpu_req_errata;
 }
 
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index fbb2b8d39321..a696d18d1b0d 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -141,6 +141,26 @@ asm volatile(ALTERNATIVE(						\
 	: "=r" (__ovl) :						\
 	: "memory")
 
+static __always_inline bool
+riscv_has_errata_thead_qspinlock(void)
+{
+	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+		asm_volatile_goto(
+		ALTERNATIVE(
+		"j	%l[l_no]", "nop",
+		THEAD_VENDOR_ID,
+		ERRATA_THEAD_QSPINLOCK,
+		CONFIG_ERRATA_THEAD_QSPINLOCK)
+		: : : : l_no);
+	} else {
+		goto l_no;
+	}
+
+	return true;
+l_no:
+	return false;
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index 73078cfe4029..1f1d03877f5f 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -19,7 +19,8 @@
 #define	ERRATA_THEAD_CMO 1
 #define	ERRATA_THEAD_PMU 2
 #define	ERRATA_THEAD_WRITE_ONCE 3
-#define	ERRATA_THEAD_NUMBER 4
+#define	ERRATA_THEAD_QSPINLOCK 4
+#define	ERRATA_THEAD_NUMBER 5
 #endif
 
 #endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index f8dbbe1bbd34..d9694fe40a9a 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
 		 * spinlock value, the only way is to change from queued_spinlock to
 		 * ticket_spinlock, but not vice versa.
 		 */
-		if (!force_qspinlock) {
+		if (!force_qspinlock &&
+		    !riscv_has_errata_thead_qspinlock()) {
 			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
 		}
 #endif
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 08/19] riscv: qspinlock: Use new static key for controlling call of virt_spin_lock()
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add a static key controlling whether virt_spin_lock() should be
called. When running on bare metal, set the new key to false.

KVM guests fall back to a test-and-set spinlock, because fair locks
have horrible lock-holder preemption issues. The virt_spin_lock_key
provides a shortcut in queued_spin_lock_slowpath(), allowing
virt_spin_lock() to hijack it.
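
For reference, a hedged standalone C11 rendering of the same
test-and-set loop the diff below adds as virt_spin_lock():

	#include <stdatomic.h>

	/* Spin read-only until the lock looks free, then race one
	 * compare-and-swap. Unfair, but a preempted waiter never blocks
	 * the other waiters the way a preempted member of a queued
	 * lock's queue does inside a guest. */
	static void tas_spin_lock(atomic_int *val)
	{
		int zero;

		do {
			while (atomic_load_explicit(val,
						    memory_order_relaxed))
				;	/* cheap read-only spin, like cpu_relax() */
			zero = 0;
		} while (!atomic_compare_exchange_strong(val, &zero, 1));
	}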

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig                |  1 +
 arch/riscv/include/asm/sbi.h      |  8 ++++++++
 arch/riscv/include/asm/spinlock.h | 22 ++++++++++++++++++++++
 arch/riscv/kernel/cpufeature.c    |  4 +++-
 arch/riscv/kernel/sbi.c           |  2 +-
 arch/riscv/kernel/setup.c         | 19 +++++++++++++++++++
 6 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 119e774a3dcf..42ae45c42b4d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -20,6 +20,7 @@ config RISCV
 	select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_CURRENT_STACK_POINTER
+	select ARCH_HAS_CPU_FINALIZE_INIT
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
 	select ARCH_HAS_DEBUG_VM_PGTABLE
 	select ARCH_HAS_DEBUG_WX
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index e1523c8624cc..b7ced34b79a3 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -51,6 +51,13 @@ enum sbi_ext_base_fid {
 	SBI_EXT_BASE_GET_MIMPID,
 };
 
+enum sbi_ext_base_impl_id {
+	SBI_EXT_BASE_IMPL_ID_BBL = 0,
+	SBI_EXT_BASE_IMPL_ID_OPENSBI,
+	SBI_EXT_BASE_IMPL_ID_XVISOR,
+	SBI_EXT_BASE_IMPL_ID_KVM,
+};
+
 enum sbi_ext_time_fid {
 	SBI_EXT_TIME_SET_TIMER = 0,
 };
@@ -286,6 +293,7 @@ int sbi_console_getchar(void);
 long sbi_get_mvendorid(void);
 long sbi_get_marchid(void);
 long sbi_get_mimpid(void);
+long sbi_get_firmware_id(void);
 void sbi_set_timer(uint64_t stime_value);
 void sbi_shutdown(void);
 void sbi_send_ipi(unsigned int cpu);
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index 9eb3ad31e564..13f3e14500c0 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -4,6 +4,28 @@
 #define __ASM_RISCV_SPINLOCK_H
 
 #ifdef CONFIG_QUEUED_SPINLOCKS
+/*
+ * KVM guests fall back to a test-and-set spinlock, because fair locks
+ * have horrible lock-holder preemption issues. The virt_spin_lock_key
+ * provides a shortcut in the queued_spin_lock_slowpath() function,
+ * allowing virt_spin_lock() to hijack it.
+ */
+DECLARE_STATIC_KEY_TRUE(virt_spin_lock_key);
+
+#define virt_spin_lock virt_spin_lock
+static inline bool virt_spin_lock(struct qspinlock *lock)
+{
+	if (!static_branch_likely(&virt_spin_lock_key))
+		return false;
+
+	do {
+		while (atomic_read(&lock->val) != 0)
+			cpu_relax();
+	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
+
+	return true;
+}
+
 #define _Q_PENDING_LOOPS	(1 << 9)
 #endif
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index d9694fe40a9a..26826aa590e9 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -21,6 +21,7 @@
 #include <asm/hwcap.h>
 #include <asm/patch.h>
 #include <asm/processor.h>
+#include <asm/sbi.h>
 #include <asm/vector.h>
 
 #define NUM_ALPHA_EXTS ('z' - 'a' + 1)
@@ -343,7 +344,8 @@ void __init riscv_fill_hwcap(void)
 		 * ticket_spinlock, but not vice versa.
 		 */
 		if (!force_qspinlock &&
-		    !riscv_has_errata_thead_qspinlock()) {
+		    !riscv_has_errata_thead_qspinlock() &&
+		    (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
 			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
 		}
 #endif
diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c
index c672c8ba9a2a..398b768a02e6 100644
--- a/arch/riscv/kernel/sbi.c
+++ b/arch/riscv/kernel/sbi.c
@@ -555,7 +555,7 @@ static inline long sbi_get_spec_version(void)
 	return __sbi_base_ecall(SBI_EXT_BASE_GET_SPEC_VERSION);
 }
 
-static inline long sbi_get_firmware_id(void)
+long sbi_get_firmware_id(void)
 {
 	return __sbi_base_ecall(SBI_EXT_BASE_GET_IMP_ID);
 }
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 971fe776e2f8..def89fd8ea55 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -26,6 +26,7 @@
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpu_ops.h>
+#include <asm/cpufeature.h>
 #include <asm/early_ioremap.h>
 #include <asm/pgtable.h>
 #include <asm/setup.h>
@@ -264,6 +265,19 @@ static void __init parse_dtb(void)
 #endif
 }
 
+#ifdef CONFIG_QUEUED_SPINLOCKS
+DEFINE_STATIC_KEY_TRUE(virt_spin_lock_key);
+
+static void __init virt_spin_lock_init(void)
+{
+	if (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM ||
+	    force_qspinlock)
+		static_branch_disable(&virt_spin_lock_key);
+}
+#else
+static void __init virt_spin_lock_init(void) {}
+#endif
+
 extern void __init init_rt_signal_env(void);
 
 void __init setup_arch(char **cmdline_p)
@@ -313,6 +327,11 @@ void __init setup_arch(char **cmdline_p)
 		riscv_noncoherent_supported();
 }
 
+void __init arch_cpu_finalize_init(void)
+{
+	virt_spin_lock_init();
+}
+
 static int __init topology_init(void)
 {
 	int i, ret;
-- 
2.36.1



* [PATCH V10 09/19] RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Use static_call to switch between:
  native_queued_spin_lock_slowpath()    __pv_queued_spin_lock_slowpath()
  native_queued_spin_unlock()           __pv_queued_spin_unlock()

This finishes the pv_wait() implementation; pv_kick() needs the SBI
extension definition introduced in the next patches.
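
As an illustration of the dispatch pattern, here is a minimal
user-space sketch: a plain function pointer stands in for the kernel's
static_call machinery (which patches the call site in place instead of
loading a pointer at run time), and every name below is illustrative
rather than kernel API:

#include <stdio.h>

struct qspinlock_stub { unsigned int val; };

static void native_slowpath(struct qspinlock_stub *lock, unsigned int val)
{
	(void)lock;
	printf("native slowpath: val=%u\n", val);
}

static void pv_slowpath(struct qspinlock_stub *lock, unsigned int val)
{
	(void)lock;
	printf("paravirt slowpath: val=%u\n", val);
}

/* Default target, like DEFINE_STATIC_CALL(..., native_queued_spin_lock_slowpath) */
static void (*slowpath)(struct qspinlock_stub *, unsigned int) = native_slowpath;

static void pv_qspinlock_init_sketch(int is_kvm_guest)
{
	if (is_kvm_guest)
		slowpath = pv_slowpath;	/* like static_call_update() */
}

int main(void)
{
	struct qspinlock_stub lock = { 0 };

	slowpath(&lock, 1);		/* native by default */
	pv_qspinlock_init_sketch(1);
	slowpath(&lock, 1);		/* switched to the PV slowpath */
	return 0;
}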

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/Kbuild               |  1 -
 arch/riscv/include/asm/paravirt.h           | 20 +++++++++
 arch/riscv/include/asm/qspinlock.h          | 29 ++++++++++++
 arch/riscv/include/asm/qspinlock_paravirt.h |  7 +++
 arch/riscv/include/asm/spinlock.h           |  2 +-
 arch/riscv/kernel/paravirt.c                | 50 +++++++++++++++++++++
 arch/riscv/kernel/setup.c                   |  3 ++
 7 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 arch/riscv/include/asm/qspinlock.h
 create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h

diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index a0dc85e4a754..b89cb3b73c13 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -7,6 +7,5 @@ generic-y += parport.h
 generic-y += spinlock_types.h
 generic-y += qrwlock.h
 generic-y += qrwlock_types.h
-generic-y += qspinlock.h
 generic-y += user.h
 generic-y += vmlinux.lds.h
diff --git a/arch/riscv/include/asm/paravirt.h b/arch/riscv/include/asm/paravirt.h
index 10ba3d6bae4f..ed7eebbedae8 100644
--- a/arch/riscv/include/asm/paravirt.h
+++ b/arch/riscv/include/asm/paravirt.h
@@ -26,4 +26,24 @@ int __init pv_time_init(void);
 
 #endif // CONFIG_PARAVIRT
 
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+
+void pv_wait(u8 *ptr, u8 val);
+void pv_kick(int cpu);
+
+void dummy_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+void dummy_queued_spin_unlock(struct qspinlock *lock);
+
+DECLARE_STATIC_CALL(pv_queued_spin_lock_slowpath, dummy_queued_spin_lock_slowpath);
+DECLARE_STATIC_CALL(pv_queued_spin_unlock, dummy_queued_spin_unlock);
+
+void __init pv_qspinlock_init(void);
+
+static inline bool pv_is_native_spin_unlock(void)
+{
+	return false;
+}
+
+#endif /* CONFIG_PARAVIRT_SPINLOCKS */
+
 #endif
diff --git a/arch/riscv/include/asm/qspinlock.h b/arch/riscv/include/asm/qspinlock.h
new file mode 100644
index 000000000000..003e9560a0d1
--- /dev/null
+++ b/arch/riscv/include/asm/qspinlock.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_QSPINLOCK_H
+#define _ASM_RISCV_QSPINLOCK_H
+
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#include <asm/paravirt.h>
+
+/* How long a lock should spin before we consider blocking */
+#define SPIN_THRESHOLD		(1 << 15)
+
+void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+void __pv_init_lock_hash(void);
+void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+
+static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
+{
+	static_call(pv_queued_spin_lock_slowpath)(lock, val);
+}
+
+#define queued_spin_unlock	queued_spin_unlock
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+	static_call(pv_queued_spin_unlock)(lock);
+}
+#endif /* CONFIG_PARAVIRT_SPINLOCKS */
+
+#include <asm-generic/qspinlock.h>
+
+#endif /* _ASM_RISCV_QSPINLOCK_H */
diff --git a/arch/riscv/include/asm/qspinlock_paravirt.h b/arch/riscv/include/asm/qspinlock_paravirt.h
new file mode 100644
index 000000000000..ff52b41d8288
--- /dev/null
+++ b/arch/riscv/include/asm/qspinlock_paravirt.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_QSPINLOCK_PARAVIRT_H
+#define _ASM_RISCV_QSPINLOCK_PARAVIRT_H
+
+void __pv_queued_spin_unlock(struct qspinlock *lock);
+
+#endif /* _ASM_RISCV_QSPINLOCK_PARAVIRT_H */
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index 13f3e14500c0..a8ba39e5f8dd 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -39,7 +39,7 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
 #undef arch_spin_trylock
 #undef arch_spin_unlock
 
-#include <asm-generic/qspinlock.h>
+#include <asm/qspinlock.h>
 #include <asm/hwcap.h>
 
 #undef arch_spin_is_locked
diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c
index 35816fc10470..1bacb2cf3872 100644
--- a/arch/riscv/kernel/paravirt.c
+++ b/arch/riscv/kernel/paravirt.c
@@ -130,3 +130,53 @@ int __init pv_time_init(void)
 
 	return 0;
 }
+
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+#include <asm/qspinlock_paravirt.h>
+
+void pv_kick(int cpu)
+{
+	/* Stub: replaced by an SBI PVLOCK kick_cpu call in a later patch. */
+	return;
+}
+
+void pv_wait(u8 *ptr, u8 val)
+{
+	unsigned long flags;
+
+	if (in_nmi())
+		return;
+
+	local_irq_save(flags);
+	if (READ_ONCE(*ptr) != val)
+		goto out;
+
+	/* wait_for_interrupt(); */
+out:
+	local_irq_restore(flags);
+}
+
+static void native_queued_spin_unlock(struct qspinlock *lock)
+{
+	smp_store_release(&lock->locked, 0);
+}
+
+DEFINE_STATIC_CALL(pv_queued_spin_lock_slowpath, native_queued_spin_lock_slowpath);
+DEFINE_STATIC_CALL(pv_queued_spin_unlock, native_queued_spin_unlock);
+EXPORT_SYMBOL(__SCK__pv_queued_spin_lock_slowpath);
+EXPORT_SYMBOL(__SCK__pv_queued_spin_unlock);
+
+void __init pv_qspinlock_init(void)
+{
+	if (num_possible_cpus() == 1)
+		return;
+
+	if (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)
+		return;
+
+	pr_info("PV qspinlocks enabled\n");
+	__pv_init_lock_hash();
+
+	static_call_update(pv_queued_spin_lock_slowpath, __pv_queued_spin_lock_slowpath);
+	static_call_update(pv_queued_spin_unlock, __pv_queued_spin_unlock);
+}
+#endif
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index def89fd8ea55..40f5b9402562 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -329,6 +329,9 @@ void __init setup_arch(char **cmdline_p)
 
 void __init arch_cpu_finalize_init(void)
 {
+#ifdef CONFIG_PARAVIRT_SPINLOCKS
+	pv_qspinlock_init();
+#endif
 	virt_spin_lock_init();
 }
 
-- 
2.36.1



* [PATCH V10 10/19] RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add the files and functions needed to support the SBI PVLOCK (paravirt
qspinlock kick_cpu) extension. This prepares for the core
implementation of kick_cpu in the next patch.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/kvm_vcpu_sbi.h |  1 +
 arch/riscv/include/uapi/asm/kvm.h     |  1 +
 arch/riscv/kvm/Makefile               |  1 +
 arch/riscv/kvm/vcpu_sbi.c             |  4 +++
 arch/riscv/kvm/vcpu_sbi_pvlock.c      | 38 +++++++++++++++++++++++++++
 5 files changed, 45 insertions(+)
 create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c

diff --git a/arch/riscv/include/asm/kvm_vcpu_sbi.h b/arch/riscv/include/asm/kvm_vcpu_sbi.h
index cdcf0ff07be7..7b4d60b54d7e 100644
--- a/arch/riscv/include/asm/kvm_vcpu_sbi.h
+++ b/arch/riscv/include/asm/kvm_vcpu_sbi.h
@@ -71,6 +71,7 @@ extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_srst;
 extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_hsm;
 extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_experimental;
 extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_vendor;
+extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pvlock;
 
 #ifdef CONFIG_RISCV_PMU_SBI
 extern const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pmu;
diff --git a/arch/riscv/include/uapi/asm/kvm.h b/arch/riscv/include/uapi/asm/kvm.h
index 930fdc4101cd..e2100f994854 100644
--- a/arch/riscv/include/uapi/asm/kvm.h
+++ b/arch/riscv/include/uapi/asm/kvm.h
@@ -141,6 +141,7 @@ enum KVM_RISCV_SBI_EXT_ID {
 	KVM_RISCV_SBI_EXT_PMU,
 	KVM_RISCV_SBI_EXT_EXPERIMENTAL,
 	KVM_RISCV_SBI_EXT_VENDOR,
+	KVM_RISCV_SBI_EXT_PVLOCK,
 	KVM_RISCV_SBI_EXT_MAX,
 };
 
diff --git a/arch/riscv/kvm/Makefile b/arch/riscv/kvm/Makefile
index fee0671e2dc1..c704da7b0a42 100644
--- a/arch/riscv/kvm/Makefile
+++ b/arch/riscv/kvm/Makefile
@@ -25,6 +25,7 @@ kvm-$(CONFIG_RISCV_SBI_V01) += vcpu_sbi_v01.o
 kvm-y += vcpu_sbi_base.o
 kvm-y += vcpu_sbi_replace.o
 kvm-y += vcpu_sbi_hsm.o
+kvm-y += vcpu_sbi_pvlock.o
 kvm-y += vcpu_timer.o
 kvm-$(CONFIG_RISCV_PMU_SBI) += vcpu_pmu.o vcpu_sbi_pmu.o
 kvm-y += aia.o
diff --git a/arch/riscv/kvm/vcpu_sbi.c b/arch/riscv/kvm/vcpu_sbi.c
index 7b46e04fb667..ea225d48edb2 100644
--- a/arch/riscv/kvm/vcpu_sbi.c
+++ b/arch/riscv/kvm/vcpu_sbi.c
@@ -74,6 +74,10 @@ static const struct kvm_riscv_sbi_extension_entry sbi_ext[] = {
 		.ext_idx = KVM_RISCV_SBI_EXT_VENDOR,
 		.ext_ptr = &vcpu_sbi_ext_vendor,
 	},
+	{
+		.ext_idx = KVM_RISCV_SBI_EXT_PVLOCK,
+		.ext_ptr = &vcpu_sbi_ext_pvlock,
+	},
 };
 
 void kvm_riscv_vcpu_sbi_forward(struct kvm_vcpu *vcpu, struct kvm_run *run)
diff --git a/arch/riscv/kvm/vcpu_sbi_pvlock.c b/arch/riscv/kvm/vcpu_sbi_pvlock.c
new file mode 100644
index 000000000000..544a456c5041
--- /dev/null
+++ b/arch/riscv/kvm/vcpu_sbi_pvlock.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c), 2023 Alibaba Cloud
+ *
+ * Authors:
+ *     Guo Ren <guoren@linux.alibaba.com>
+ */
+
+#include <linux/errno.h>
+#include <linux/err.h>
+#include <linux/kvm_host.h>
+#include <asm/sbi.h>
+#include <asm/kvm_vcpu_sbi.h>
+
+static int kvm_sbi_ext_pvlock_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				      struct kvm_vcpu_sbi_return *retdata)
+{
+	int ret = 0;
+	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+	unsigned long funcid = cp->a6;
+
+	switch (funcid) {
+	case SBI_EXT_PVLOCK_KICK_CPU:
+		break;
+	default:
+		ret = SBI_ERR_NOT_SUPPORTED;
+	}
+
+	retdata->err_val = ret;
+
+	return 0;
+}
+
+const struct kvm_vcpu_sbi_extension vcpu_sbi_ext_pvlock = {
+	.extid_start = SBI_EXT_PVLOCK,
+	.extid_end = SBI_EXT_PVLOCK,
+	.handler = kvm_sbi_ext_pvlock_handler,
+};
-- 
2.36.1



* [PATCH V10 11/19] RISC-V: paravirt: pvqspinlock: KVM: Implement kvm_sbi_ext_pvlock_kick_cpu()
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

We only need to call kvm_vcpu_kick() to bring the target vcpu out
of the halt state. No IRQ is raised and no other request is made;
it is a pure vcpu_kick.
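
For context, the guest-side caller added later in this series is
expected to be a single SBI ecall; a hedged sketch follows
(SBI_EXT_PVLOCK and SBI_EXT_PVLOCK_KICK_CPU come from the later SBI
PVLOCK patches, and passing the hartid via cpuid_to_hartid_map() is an
assumption of this sketch):

void pv_kick(int cpu)
{
	/* Ask the hypervisor to wake the target vcpu out of halt. */
	sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
		  cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
}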

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/kvm/vcpu_sbi_pvlock.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/riscv/kvm/vcpu_sbi_pvlock.c b/arch/riscv/kvm/vcpu_sbi_pvlock.c
index 544a456c5041..914fc58aedfe 100644
--- a/arch/riscv/kvm/vcpu_sbi_pvlock.c
+++ b/arch/riscv/kvm/vcpu_sbi_pvlock.c
@@ -12,6 +12,24 @@
 #include <asm/sbi.h>
 #include <asm/kvm_vcpu_sbi.h>
 
+static int kvm_sbi_ext_pvlock_kick_cpu(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpu_context *cp = &vcpu->arch.guest_context;
+	struct kvm *kvm = vcpu->kvm;
+	struct kvm_vcpu *target;
+
+	target = kvm_get_vcpu_by_id(kvm, cp->a0);
+	if (!target)
+		return SBI_ERR_INVALID_PARAM;
+
+	kvm_vcpu_kick(target);
+
+	if (READ_ONCE(target->ready))
+		kvm_vcpu_yield_to(target);
+
+	return SBI_SUCCESS;
+}
+
 static int kvm_sbi_ext_pvlock_handler(struct kvm_vcpu *vcpu, struct kvm_run *run,
 				      struct kvm_vcpu_sbi_return *retdata)
 {
@@ -21,6 +39,7 @@ static int kvm_sbi_ext_pvlock_handler(struct kvm_vcpu *vcpu, struct kvm_run *run
 
 	switch (funcid) {
 	case SBI_EXT_PVLOCK_KICK_CPU:
+		ret = kvm_sbi_ext_pvlock_kick_cpu(vcpu);
 		break;
 	default:
 		ret = SBI_ERR_NOT_SUPPORTED;
-- 
2.36.1



* [PATCH V10 12/19] RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add the nopvspin kernel parameter for RISC-V. It disables the
qspinlock slow path that uses PV optimizations, which allow the
hypervisor to 'idle' the guest on lock contention.
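
For example, an illustrative QEMU invocation that boots a KVM guest
with PV qspinlocks disabled (the kernel image path and the other
options are placeholders):

qemu-system-riscv64 -machine virt -enable-kvm -smp 4 -m 2G \
	-kernel Image \
	-append "root=/dev/vda ro nopvspin"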

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  2 +-
 arch/riscv/kernel/paravirt.c                    | 13 +++++++++++++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index de6b7ee752cd..1a8878f6bfbd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3820,7 +3820,7 @@
 			as generic guest with no PV drivers. Currently support
 			XEN HVM, KVM, HYPER_V and VMWARE guest.
 
-	nopvspin	[X86,XEN,KVM]
+	nopvspin	[X86,XEN,KVM,RISC-V]
 			Disables the qspinlock slow path using PV optimizations
 			which allow the hypervisor to 'idle' the guest on lock
 			contention.
diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c
index 1bacb2cf3872..b55c3d3c0c17 100644
--- a/arch/riscv/kernel/paravirt.c
+++ b/arch/riscv/kernel/paravirt.c
@@ -165,8 +165,21 @@ DEFINE_STATIC_CALL(pv_queued_spin_unlock, native_queued_spin_unlock);
 EXPORT_SYMBOL(__SCK__pv_queued_spin_lock_slowpath);
 EXPORT_SYMBOL(__SCK__pv_queued_spin_unlock);
 
+static bool nopvspin;
+static __init int parse_nopvspin(char *arg)
+{
+	nopvspin = true;
+	return 0;
+}
+early_param("nopvspin", parse_nopvspin);
+
 void __init pv_qspinlock_init(void)
 {
+	if (nopvspin) {
+		pr_info("PV qspinlocks disabled\n");
+		return;
+	}
+
 	if (num_possible_cpus() == 1)
 		return;
 
-- 
2.36.1



* [PATCH V10 13/19] RISC-V: paravirt: pvqspinlock: Remove unnecessary definitions of cmpxchg & xchg
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The custom xchg/cmpxchg_release macro definitions are identical to
the common code at the binary level, and the xchg32/64 macro
definitions have been abandoned in Linux. Thus, remove all of them.

This prepares for the next cmpxchg_small & xchg8 patches.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/cmpxchg.h | 93 --------------------------------
 1 file changed, 93 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index d12231d752a4..3ab37215ed86 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -103,41 +103,6 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					    _x_, sizeof(*(ptr)));	\
 })
 
-#define __xchg_release(ptr, new, size)					\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(new) __new = (new);					\
-	__typeof__(*(ptr)) __ret;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.w %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"	amoswap.d %0, %2, %1\n"			\
-			: "=r" (__ret), "+A" (*__ptr)			\
-			: "r" (__new)					\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
-
-#define arch_xchg_release(ptr, x)					\
-({									\
-	__typeof__(*(ptr)) _x_ = (x);					\
-	(__typeof__(*(ptr))) __xchg_release((ptr),			\
-					    _x_, sizeof(*(ptr)));	\
-})
-
 #define __arch_xchg(ptr, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -170,18 +135,6 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	(__typeof__(*(ptr))) __arch_xchg((ptr), _x_, sizeof(*(ptr)));	\
 })
 
-#define xchg32(ptr, x)							\
-({									\
-	BUILD_BUG_ON(sizeof(*(ptr)) != 4);				\
-	arch_xchg((ptr), (x));						\
-})
-
-#define xchg64(ptr, x)							\
-({									\
-	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
-	arch_xchg((ptr), (x));						\
-})
-
 /*
  * Atomic compare and exchange.  Compare OLD with MEM, if identical,
  * store NEW in MEM.  Return the initial value in MEM.  Success is
@@ -277,52 +230,6 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					_o_, _n_, sizeof(*(ptr)));	\
 })
 
-#define __cmpxchg_release(ptr, old, new, size)				\
-({									\
-	__typeof__(ptr) __ptr = (ptr);					\
-	__typeof__(*(ptr)) __old = (old);				\
-	__typeof__(*(ptr)) __new = (new);				\
-	__typeof__(*(ptr)) __ret;					\
-	register unsigned int __rc;					\
-	switch (size) {							\
-	case 4:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.w %0, %2\n"				\
-			"	bne  %0, %z3, 1f\n"			\
-			"	sc.w %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" ((long)__old), "rJ" (__new)		\
-			: "memory");					\
-		break;							\
-	case 8:								\
-		__asm__ __volatile__ (					\
-			RISCV_RELEASE_BARRIER				\
-			"0:	lr.d %0, %2\n"				\
-			"	bne %0, %z3, 1f\n"			\
-			"	sc.d %1, %z4, %2\n"			\
-			"	bnez %1, 0b\n"				\
-			"1:\n"						\
-			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
-			: "rJ" (__old), "rJ" (__new)			\
-			: "memory");					\
-		break;							\
-	default:							\
-		BUILD_BUG();						\
-	}								\
-	__ret;								\
-})
-
-#define arch_cmpxchg_release(ptr, o, n)					\
-({									\
-	__typeof__(*(ptr)) _o_ = (o);					\
-	__typeof__(*(ptr)) _n_ = (n);					\
-	(__typeof__(*(ptr))) __cmpxchg_release((ptr),			\
-					_o_, _n_, sizeof(*(ptr)));	\
-})
-
 #define __cmpxchg(ptr, old, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
-- 
2.36.1



* [PATCH V10 14/19] RISC-V: paravirt: pvqspinlock: Add xchg8 & cmpxchg_small support
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The pvqspinlock needs additional sub-word atomic operations. Here
is the list:
 - xchg8 (RCsc)
 - cmpxchg8/16_relaxed
 - cmpxchg8/16_release (RCpc)
 - cmpxchg8_acquire (RCpc)
 - cmpxchg8 (RCsc)

Although the paravirt qspinlock doesn't have the native_qspinlock
fairness, giving a strong forward progress guarantee to these
atomic operations could prevent unnecessary retries, which would
otherwise cause cache line bouncing.
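
To make the sub-word technique concrete, here is a minimal user-space
sketch of an 8-bit cmpxchg built from a 32-bit compare-and-swap on the
enclosing aligned word, using the GCC/Clang __atomic builtins in place
of RISC-V lr.w/sc.w (illustrative only, not the kernel implementation;
a little-endian byte layout is assumed, as on RISC-V):

#include <stdint.h>
#include <stdio.h>

static uint8_t cmpxchg8_sketch(uint8_t *ptr, uint8_t old, uint8_t new)
{
	uintptr_t addr = (uintptr_t)ptr;
	uint32_t *word = (uint32_t *)(addr & ~(uintptr_t)0x3);
	unsigned int shift = (addr & 0x3) * 8;	/* little-endian byte offset */
	uint32_t mask = 0xffu << shift;
	uint32_t seen = __atomic_load_n(word, __ATOMIC_RELAXED);

	for (;;) {
		uint8_t cur = (seen & mask) >> shift;
		uint32_t want;

		if (cur != old)
			return cur;	/* comparison failed, like the bne above */

		want = (seen & ~mask) | ((uint32_t)new << shift);
		/* CAS the whole aligned word; only our byte differs. */
		if (__atomic_compare_exchange_n(word, &seen, want, 0,
						__ATOMIC_SEQ_CST, __ATOMIC_RELAXED))
			return old;
		/* 'seen' was refreshed by the failed CAS; retry. */
	}
}

int main(void)
{
	uint32_t backing = 0x04030201u;	/* bytes 1,2,3,4 on little-endian */
	uint8_t *bytes = (uint8_t *)&backing;
	uint8_t ret = cmpxchg8_sketch(&bytes[1], 2, 9);

	printf("ret=%u bytes[1]=%u\n", ret, bytes[1]);	/* ret=2 bytes[1]=9 */
	return 0;
}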

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/cmpxchg.h | 177 +++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 3ab37215ed86..2fd797c04e7a 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -103,12 +103,37 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					    _x_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __xchg8(ulong new, void *ptr)
+{
+	ulong ret, tmp;
+	ulong shift = ((ulong)ptr & 3) * 8;
+	ulong mask = 0xff << shift;
+	ulong *__ptr = (ulong *)((ulong)ptr & ~3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	or   %1, %1, %z4\n"
+		"	sc.w.rl %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"	fence w, rw\n"
+		: "=&r" (ret), "=&r" (tmp), "+A" (*__ptr)
+		: "rJ" (~mask), "rJ" (new << shift)
+		: "memory");
+
+	return (ulong)((ret & mask) >> shift);
+}
+
 #define __arch_xchg(ptr, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__xchg8((ulong)__new, __ptr);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w.aqrl %0, %2, %1\n"		\
@@ -140,6 +165,51 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
  * store NEW in MEM.  Return the initial value in MEM.  Success is
  * indicated by comparing RETURN with OLD.
  */
+static inline ulong __cmpxchg_small_relaxed(void *ptr, ulong old,
+					    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
+
 #define __cmpxchg_relaxed(ptr, old, new, size)				\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -148,6 +218,11 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small_relaxed(__ptr, (ulong)__old,	\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -184,6 +259,52 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					_o_, _n_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __cmpxchg_small_acquire(void *ptr, ulong old,
+					    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		RISCV_ACQUIRE_BARRIER
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
+
 #define __cmpxchg_acquire(ptr, old, new, size)				\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -192,6 +313,12 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small_acquire(__ptr, (ulong)__old,	\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -230,6 +357,51 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					_o_, _n_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __cmpxchg_small(void *ptr, ulong old,
+				    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w.rl %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"	fence w, rw\n"
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
+
 #define __cmpxchg(ptr, old, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -238,6 +410,11 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small(__ptr, (ulong)__old,		\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
-- 
2.36.1



* [PATCH V10 14/19] RISC-V: paravirt: pvqspinlock: Add xchg8 & cmpxchg_small support
@ 2023-08-02 16:46   ` guoren
  0 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The pvqspinlock needs additional sub-word atomic operations. Here
is the list:
 - xchg8 (RCsc)
 - cmpxchg8/16_relaxed
 - cmpxchg8/16_release (RCpc)
 - cmpxchg8_acquire (RCpc)
 - cmpxchg8 (RCsc)

Although the paravirt qspinlock doesn't have the native_qspinlock
fairness, giving a strong forward progress guarantee to these
atomic operations could prevent unnecessary retries, which would
otherwise cause cache line bouncing.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/cmpxchg.h | 177 +++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 3ab37215ed86..2fd797c04e7a 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -103,12 +103,37 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					    _x_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __xchg8(ulong new, void *ptr)
+{
+	ulong ret, tmp;
+	ulong shift = ((ulong)ptr & 3) * 8;
+	ulong mask = 0xff << shift;
+	ulong *__ptr = (ulong *)((ulong)ptr & ~3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	or   %1, %1, %z4\n"
+		"	sc.w.rl %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"	fence w, rw\n"
+		: "=&r" (ret), "=&r" (tmp), "+A" (*__ptr)
+		: "rJ" (~mask), "rJ" (new << shift)
+		: "memory");
+
+	return (ulong)((ret & mask) >> shift);
+}
+
 #define __arch_xchg(ptr, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__xchg8((ulong)__new, __ptr);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w.aqrl %0, %2, %1\n"		\
@@ -140,6 +165,51 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
  * store NEW in MEM.  Return the initial value in MEM.  Success is
  * indicated by comparing RETURN with OLD.
  */
+static inline ulong __cmpxchg_small_relaxed(void *ptr, ulong old,
+					    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
+
 #define __cmpxchg_relaxed(ptr, old, new, size)				\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -148,6 +218,11 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small_relaxed(__ptr, (ulong)__old,	\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -184,6 +259,52 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					_o_, _n_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __cmpxchg_small_acquire(void *ptr, ulong old,
+					    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		RISCV_ACQUIRE_BARRIER
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
+
 #define __cmpxchg_acquire(ptr, old, new, size)				\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -192,6 +313,12 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small_acquire(__ptr, (ulong)__old,	\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -230,6 +357,51 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 					_o_, _n_, sizeof(*(ptr)));	\
 })
 
+static inline ulong __cmpxchg_small(void *ptr, ulong old,
+				    ulong new, ulong size)
+{
+	ulong shift;
+	ulong ret, mask, temp;
+	volatile ulong *ptr32;
+
+	/* Mask inputs to the correct size. */
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);
+	old &= mask;
+	new &= mask;
+
+	/*
+	 * Calculate a shift & mask that correspond to the value we wish to
+	 * compare & exchange within the naturally aligned 4 byte integer
+	 * that includes it.
+	 */
+	shift = (ulong)ptr & 0x3;
+	shift *= BITS_PER_BYTE;
+	old <<= shift;
+	new <<= shift;
+	mask <<= shift;
+
+	/*
+	 * Calculate a pointer to the naturally aligned 4 byte integer that
+	 * includes our byte of interest, and load its value.
+	 */
+	ptr32 = (volatile ulong *)((ulong)ptr & ~0x3);
+
+	__asm__ __volatile__ (
+		"0:	lr.w %0, %2\n"
+		"	and  %1, %0, %z3\n"
+		"	bne  %1, %z5, 1f\n"
+		"	and  %1, %0, %z4\n"
+		"	or   %1, %1, %z6\n"
+		"	sc.w.rl %1, %1, %2\n"
+		"	bnez %1, 0b\n"
+		"	fence w, rw\n"
+		"1:\n"
+		: "=&r" (ret), "=&r" (temp), "+A" (*ptr32)
+		: "rJ" (mask), "rJ" (~mask), "rJ" (old), "rJ" (new)
+		: "memory");
+
+	return (ret & mask) >> shift;
+}
 #define __cmpxchg(ptr, old, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
@@ -238,6 +410,11 @@ static inline ulong __xchg16_relaxed(ulong new, void *ptr)
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+		__ret = (__typeof__(*(ptr)))				\
+			__cmpxchg_small(__ptr, (ulong)__old,		\
+					(ulong)__new, (ulong)size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH V10 15/19] RISC-V: paravirt: pvqspinlock: Add SBI implementation
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Implement pv_kick() with an SBI call, and add SBI_EXT_PVLOCK
extension detection.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/sbi.h | 6 ++++++
 arch/riscv/kernel/paravirt.c | 7 ++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h
index b7ced34b79a3..26b4ec039f32 100644
--- a/arch/riscv/include/asm/sbi.h
+++ b/arch/riscv/include/asm/sbi.h
@@ -31,6 +31,7 @@ enum sbi_ext_id {
 	SBI_EXT_SRST = 0x53525354,
 	SBI_EXT_PMU = 0x504D55,
 	SBI_EXT_STA = 0x535441,
+	SBI_EXT_PVLOCK = 0xAB0401,
 
 	/* Experimentals extensions must lie within this range */
 	SBI_EXT_EXPERIMENTAL_START = 0x08000000,
@@ -244,6 +245,11 @@ enum sbi_pmu_ctr_type {
 /* Flags defined for counter stop function */
 #define SBI_PMU_STOP_FLAG_RESET (1 << 0)
 
+/* SBI PVLOCK (kick cpu out of wfi) */
+enum sbi_ext_pvlock_fid {
+	SBI_EXT_PVLOCK_KICK_CPU = 0,
+};
+
 /* SBI STA (steal-time accounting) extension */
 enum sbi_ext_sta_fid {
 	SBI_EXT_STA_STEAL_TIME_SET_SHMEM = 0,
diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c
index b55c3d3c0c17..564d64f11e4f 100644
--- a/arch/riscv/kernel/paravirt.c
+++ b/arch/riscv/kernel/paravirt.c
@@ -136,6 +136,8 @@ int __init pv_time_init(void)
 
 void pv_kick(int cpu)
 {
+	sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
+		  cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
 	return;
 }
 
@@ -150,7 +152,7 @@ void pv_wait(u8 *ptr, u8 val)
 	if (READ_ONCE(*ptr) != val)
 		goto out;
 
-	/* wait_for_interrupt(); */
+	wait_for_interrupt();
 out:
 	local_irq_restore(flags);
 }
@@ -186,6 +188,9 @@ void __init pv_qspinlock_init(void)
 	if(sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)
 		return;
 
+	if (!sbi_probe_extension(SBI_EXT_PVLOCK))
+		return;
+
 	pr_info("PV qspinlocks enabled\n");
 	__pv_init_lock_hash();
 
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 77+ messages in thread
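
Condensing the two hunks above, the pv_wait()/pv_kick() pairing works
as follows (an illustrative sketch of the patched functions, not the
literal kernel code). The waiter re-checks the condition with
interrupts disabled before sleeping, so a kick that lands between the
check and the WFI is not lost, since the pending interrupt makes
wait_for_interrupt() return immediately:

	void pv_wait(u8 *ptr, u8 val)	/* the halting side */
	{
		unsigned long flags;

		local_irq_save(flags);
		/* Only sleep if the lock word still says we should. */
		if (READ_ONCE(*ptr) == val)
			wait_for_interrupt();	/* WFI; woken by the kick */
		local_irq_restore(flags);
	}

	void pv_kick(int cpu)		/* the waking side */
	{
		/* Ask the hypervisor to inject an interrupt into that vCPU. */
		sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
			  cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
	}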

* [PATCH V10 16/19] RISC-V: paravirt: pvqspinlock: Add kconfig entry
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add a Kconfig entry for paravirt_spinlock, a virtualization-friendly
unfair qspinlock backend that halts the virtual CPU rather than
spinning.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 42ae45c42b4d..13f345b54581 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -770,6 +770,7 @@ config RELOCATABLE
 config PARAVIRT
 	bool "Enable paravirtualization code"
 	depends on RISCV_SBI
+	select PARAVIRT_SPINLOCKS
 	default y
 	help
 	  This changes the kernel so it can modify itself when it is run
@@ -788,6 +789,17 @@ config PARAVIRT_TIME_ACCOUNTING
 
 	  If in doubt, say N here.
 
+config PARAVIRT_SPINLOCKS
+	bool "Paravirtualization layer for spinlocks"
+	depends on PARAVIRT && SMP
+	help
+	  Paravirtualized spinlocks allow an unfair qspinlock to replace the
+	  test-and-set kvm-guest virt spinlock implementation with something
+	  virtualization-friendly, for example by halting the virtual CPU
+	  rather than spinning.
+
+	  If you are unsure how to answer this question, answer Y.
+
 endmenu # "Kernel features"
 
 menu "Boot options"
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 77+ messages in thread
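
Assuming a guest kernel built from this series, the relevant .config
fragment is just the following (PARAVIRT now selects the new symbol,
so both default to y):

	CONFIG_RISCV_SBI=y
	CONFIG_SMP=y
	CONFIG_PARAVIRT=y
	CONFIG_PARAVIRT_SPINLOCKS=y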

* [PATCH V10 17/19] RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:46   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:46 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Add trace points for pv_kick/pv_wait; here is an example of the output:

 entries-in-buffer/entries-written: 33927/33927   #P:12

                                _-----=> irqs-off/BH-disabled
                               / _----=> need-resched
                              | / _---=> hardirq/softirq
                              || / _--=> preempt-depth
                              ||| / _-=> migrate-disable
                              |||| /     delay
           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
              | |         |   |||||     |         |
             sh-100     [001] d..2.    28.312294: pv_wait: cpu 1 out of wfi
         <idle>-0       [000] d.h4.    28.322030: pv_kick: cpu 0 kick target cpu 1
             sh-100     [001] d..2.    30.982631: pv_wait: cpu 1 out of wfi
         <idle>-0       [000] d.h4.    30.993289: pv_kick: cpu 0 kick target cpu 1
             sh-100     [002] d..2.    44.987573: pv_wait: cpu 2 out of wfi
         <idle>-0       [000] d.h4.    44.989000: pv_kick: cpu 0 kick target cpu 2
         <idle>-0       [003] d.s3.    51.593978: pv_kick: cpu 3 kick target cpu 4
      rcu_sched-15      [004] d..2.    51.595192: pv_wait: cpu 4 out of wfi
lock_torture_wr-115     [004] ...2.    52.656482: pv_kick: cpu 4 kick target cpu 2
lock_torture_wr-113     [002] d..2.    52.659146: pv_wait: cpu 2 out of wfi
lock_torture_wr-114     [008] d..2.    52.659507: pv_wait: cpu 8 out of wfi
lock_torture_wr-114     [008] d..2.    52.663503: pv_wait: cpu 8 out of wfi
lock_torture_wr-113     [002] ...2.    52.666128: pv_kick: cpu 2 kick target cpu 8
lock_torture_wr-114     [008] d..2.    52.667261: pv_wait: cpu 8 out of wfi
lock_torture_wr-114     [009] .n.2.    53.141515: pv_kick: cpu 9 kick target cpu 11
lock_torture_wr-113     [002] d..2.    53.143339: pv_wait: cpu 2 out of wfi
lock_torture_wr-116     [007] d..2.    53.143412: pv_wait: cpu 7 out of wfi
lock_torture_wr-118     [000] d..2.    53.143457: pv_wait: cpu 0 out of wfi
lock_torture_wr-115     [008] d..2.    53.143481: pv_wait: cpu 8 out of wfi
lock_torture_wr-117     [011] d..2.    53.143522: pv_wait: cpu 11 out of wfi
lock_torture_wr-117     [011] ...2.    53.143987: pv_kick: cpu 11 kick target cpu 8
lock_torture_wr-115     [008] ...2.    53.144269: pv_kick: cpu 8 kick target cpu 7

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/kernel/paravirt.c                  |  8 +++
 .../kernel/trace_events_filter_paravirt.h     | 60 +++++++++++++++++++
 2 files changed, 68 insertions(+)
 create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h

diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c
index 564d64f11e4f..cc80e968ab13 100644
--- a/arch/riscv/kernel/paravirt.c
+++ b/arch/riscv/kernel/paravirt.c
@@ -134,10 +134,16 @@ int __init pv_time_init(void)
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
 #include <asm/qspinlock_paravirt.h>
 
+#define CREATE_TRACE_POINTS
+#include "trace_events_filter_paravirt.h"
+
 void pv_kick(int cpu)
 {
 	sbi_ecall(SBI_EXT_PVLOCK, SBI_EXT_PVLOCK_KICK_CPU,
 		  cpuid_to_hartid_map(cpu), 0, 0, 0, 0, 0);
+
+	trace_pv_kick(smp_processor_id(), cpu);
+
 	return;
 }
 
@@ -153,6 +159,8 @@ void pv_wait(u8 *ptr, u8 val)
 		goto out;
 
 	wait_for_interrupt();
+
+	trace_pv_wait(smp_processor_id());
 out:
 	local_irq_restore(flags);
 }
diff --git a/arch/riscv/kernel/trace_events_filter_paravirt.h b/arch/riscv/kernel/trace_events_filter_paravirt.h
new file mode 100644
index 000000000000..9ff5aa451b12
--- /dev/null
+++ b/arch/riscv/kernel/trace_events_filter_paravirt.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2023 Alibaba Cloud
+ * Authors:
+ *	Guo Ren <guoren@linux.alibaba.com>
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM paravirt
+
+#if !defined(_TRACE_PARAVIRT_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_PARAVIRT_H
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(pv_kick,
+	TP_PROTO(int cpu, int target),
+	TP_ARGS(cpu, target),
+
+	TP_STRUCT__entry(
+		__field(int, cpu)
+		__field(int, target)
+	),
+
+	TP_fast_assign(
+		__entry->cpu = cpu;
+		__entry->target = target;
+	),
+
+	TP_printk("cpu %d kick target cpu %d",
+		__entry->cpu,
+		__entry->target
+	)
+);
+
+TRACE_EVENT(pv_wait,
+	TP_PROTO(int cpu),
+	TP_ARGS(cpu),
+
+	TP_STRUCT__entry(
+		__field(int, cpu)
+	),
+
+	TP_fast_assign(
+		__entry->cpu = cpu;
+	),
+
+	TP_printk("cpu %d out of wfi",
+		__entry->cpu
+	)
+);
+
+#endif /* _TRACE_PARAVIRT_H || TRACE_HEADER_MULTI_READ */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH ../../../arch/riscv/kernel/
+#define TRACE_INCLUDE_FILE trace_events_filter_paravirt
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 77+ messages in thread
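
Since the events live under the "paravirt" trace system, a capture
like the one quoted in the commit message can be reproduced from a
shell in the guest, assuming tracefs is mounted at the usual location:

	# cd /sys/kernel/tracing
	# echo 1 > events/paravirt/pv_kick/enable
	# echo 1 > events/paravirt/pv_wait/enable
	# cat trace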

* [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:47   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:47 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

The pv_ops mechanism is x86-specific infrastructure, so move the hook
assignment into the x86 directory and clean up
cna_configure_spin_lock_slowpath() to use standard code. This is
preparation for riscv to support the CNA qspinlock.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/x86/include/asm/qspinlock.h |  3 ++-
 arch/x86/kernel/alternative.c    |  6 +++++-
 kernel/locking/qspinlock_cna.h   | 14 ++++++--------
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index f48a2a250e57..100adad70bf5 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -28,7 +28,8 @@ static __always_inline u32 queued_fetch_set_pending_acquire(struct qspinlock *lo
 }
 
 #ifdef CONFIG_NUMA_AWARE_SPINLOCKS
-extern void cna_configure_spin_lock_slowpath(void);
+extern bool cna_configure_spin_lock_slowpath(void);
+extern void __cna_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
 #endif
 
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index c36df5aa3ab1..68b7392016c3 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1538,7 +1538,11 @@ void __init alternative_instructions(void)
 	paravirt_set_cap();
 
 #if defined(CONFIG_NUMA_AWARE_SPINLOCKS)
-	cna_configure_spin_lock_slowpath();
+	if (pv_ops.lock.queued_spin_lock_slowpath == native_queued_spin_lock_slowpath) {
+		if (cna_configure_spin_lock_slowpath())
+			pv_ops.lock.queued_spin_lock_slowpath =
+							__cna_queued_spin_lock_slowpath;
+	}
 #endif
 
 	/*
diff --git a/kernel/locking/qspinlock_cna.h b/kernel/locking/qspinlock_cna.h
index 17d56c739e57..5e297dc687d9 100644
--- a/kernel/locking/qspinlock_cna.h
+++ b/kernel/locking/qspinlock_cna.h
@@ -406,20 +406,18 @@ void __cna_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
  * multiple NUMA nodes in native environment, unless the user has
  * overridden this default behavior by setting the numa_spinlock flag.
  */
-void __init cna_configure_spin_lock_slowpath(void)
+bool __init cna_configure_spin_lock_slowpath(void)
 {
 
 	if (numa_spinlock_flag < 0)
-		return;
+		return false;
 
-	if (numa_spinlock_flag == 0 && (nr_node_ids < 2 ||
-		    pv_ops.lock.queued_spin_lock_slowpath !=
-			native_queued_spin_lock_slowpath))
-		return;
+	if (numa_spinlock_flag == 0 && nr_node_ids < 2)
+		return false;
 
 	cna_init_nodes();
 
-	pv_ops.lock.queued_spin_lock_slowpath = __cna_queued_spin_lock_slowpath;
-
 	pr_info("Enabling CNA spinlock\n");
+
+	return true;
 }
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 77+ messages in thread
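
The point of the new bool return is that cna_configure_spin_lock_slowpath()
no longer touches pv_ops itself, so each arch can apply the CNA slowpath
with its own patching mechanism. Side by side (a sketch drawn from this
hunk and the next patch):

	/* x86, in alternative_instructions(): */
	if (cna_configure_spin_lock_slowpath())
		pv_ops.lock.queued_spin_lock_slowpath =
						__cna_queued_spin_lock_slowpath;

	/* riscv, in pv_qspinlock_init() (added by the next patch): */
	if (cna_configure_spin_lock_slowpath())
		static_call_update(pv_queued_spin_lock_slowpath,
				   __cna_queued_spin_lock_slowpath);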

* [PATCH V10 19/19] locking/qspinlock: riscv: Add Compact NUMA-aware lock support
  2023-08-02 16:46 ` guoren
@ 2023-08-02 16:47   ` guoren
  -1 siblings, 0 replies; 77+ messages in thread
From: guoren @ 2023-08-02 16:47 UTC (permalink / raw)
  To: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren, Guo Ren

From: Guo Ren <guoren@linux.alibaba.com>

Connect riscv to the Compact NUMA-aware lock (CNA), which uses the
PARAVIRT_SPINLOCKS static_call hooks. See the numa_spinlock= parameter
in Documentation/admin-guide/kernel-parameters.txt to try it.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/Kconfig                 | 18 ++++++++++++++++++
 arch/riscv/include/asm/qspinlock.h |  5 +++++
 arch/riscv/kernel/paravirt.c       | 12 ++++++++++--
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 13f345b54581..ff483ccd26b9 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -800,6 +800,24 @@ config PARAVIRT_SPINLOCKS
 
 	  If you are unsure how to answer this question, answer Y.
 
+config NUMA_AWARE_SPINLOCKS
+	bool "NUMA-aware spinlocks"
+	depends on NUMA
+	depends on QUEUED_SPINLOCKS
+	depends on 64BIT
+	# For now, we depend on PARAVIRT_SPINLOCKS to make the patching work.
+	depends on PARAVIRT_SPINLOCKS
+	default y
+	help
+	  Introduce NUMA (Non Uniform Memory Access) awareness into
+	  the slow path of spinlocks.
+
+	  In this variant of qspinlock, the kernel will try to keep the lock
+	  on the same node, thus reducing the number of remote cache misses,
+	  while trading some of the short term fairness for better performance.
+
+	  Say N if you want absolute first-come, first-served fairness.
+
 endmenu # "Kernel features"
 
 menu "Boot options"
diff --git a/arch/riscv/include/asm/qspinlock.h b/arch/riscv/include/asm/qspinlock.h
index 003e9560a0d1..e6f2a0621af0 100644
--- a/arch/riscv/include/asm/qspinlock.h
+++ b/arch/riscv/include/asm/qspinlock.h
@@ -12,6 +12,11 @@ void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
 void __pv_init_lock_hash(void);
 void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
 
+#ifdef CONFIG_NUMA_AWARE_SPINLOCKS
+bool cna_configure_spin_lock_slowpath(void);
+void __cna_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val);
+#endif
+
 static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 {
 	static_call(pv_queued_spin_lock_slowpath)(lock, val);
diff --git a/arch/riscv/kernel/paravirt.c b/arch/riscv/kernel/paravirt.c
index cc80e968ab13..9466f693a98c 100644
--- a/arch/riscv/kernel/paravirt.c
+++ b/arch/riscv/kernel/paravirt.c
@@ -193,8 +193,8 @@ void __init pv_qspinlock_init(void)
 	if (num_possible_cpus() == 1)
 		return;
 
-	if(sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)
-		return;
+	if (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)
+		goto cna_qspinlock;
 
 	if (!sbi_probe_extension(SBI_EXT_PVLOCK))
 		return;
@@ -204,5 +204,13 @@ void __init pv_qspinlock_init(void)
 
 	static_call_update(pv_queued_spin_lock_slowpath, __pv_queued_spin_lock_slowpath);
 	static_call_update(pv_queued_spin_unlock, __pv_queued_spin_unlock);
+	return;
+
+cna_qspinlock:
+#ifdef CONFIG_NUMA_AWARE_SPINLOCKS
+	if (cna_configure_spin_lock_slowpath())
+		static_call_update(pv_queued_spin_lock_slowpath,
+					__cna_queued_spin_lock_slowpath);
+#endif
 }
 #endif
-- 
2.36.1



^ permalink raw reply related	[flat|nested] 77+ messages in thread
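
Flattening the goto, the selection policy pv_qspinlock_init() ends up
implementing reads as follows (a sketch equivalent to the patched
function, not the literal code):

	static void __init pv_qspinlock_init_sketch(void)
	{
		if (num_possible_cpus() == 1)
			return;			/* UP: keep native qspinlock */

		if (sbi_get_firmware_id() == SBI_EXT_BASE_IMPL_ID_KVM) {
			/* KVM guest without PVLOCK: stay on native qspinlock. */
			if (!sbi_probe_extension(SBI_EXT_PVLOCK))
				return;
			pr_info("PV qspinlocks enabled\n");
			__pv_init_lock_hash();
			static_call_update(pv_queued_spin_lock_slowpath,
					   __pv_queued_spin_lock_slowpath);
			static_call_update(pv_queued_spin_unlock,
					   __pv_queued_spin_unlock);
			return;
		}

	#ifdef CONFIG_NUMA_AWARE_SPINLOCKS
		/* Not a KVM guest (e.g. bare metal): try CNA if configured. */
		if (cna_configure_spin_lock_slowpath())
			static_call_update(pv_queued_spin_lock_slowpath,
					   __cna_queued_spin_lock_slowpath);
	#endif
	}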

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-02 16:46   ` guoren
@ 2023-08-04  9:05     ` Conor Dooley
  -1 siblings, 0 replies; 77+ messages in thread
From: Conor Dooley @ 2023-08-04  9:05 UTC (permalink / raw)
  To: guoren
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren


Hey Guo Ren,

On Wed, Aug 02, 2023 at 12:46:49PM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> According to qspinlock requirements, RISC-V gives out a weak LR/SC
> forward progress guarantee which does not satisfy qspinlock. But
> many vendors could produce stronger forward guarantee LR/SC to
> ensure the xchg_tail could be finished in time on any kind of
> hart. T-HEAD is the vendor which implements strong forward
> guarantee LR/SC instruction pairs, so enable qspinlock for T-HEAD
> with errata help.
> 
> T-HEAD early version of processors has the merge buffer delay
> problem, so we need ERRATA_WRITEONCE to support qspinlock.
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  arch/riscv/Kconfig.errata              | 13 +++++++++++++
>  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
>  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
>  arch/riscv/include/asm/vendorid_list.h |  3 ++-
>  arch/riscv/kernel/cpufeature.c         |  3 ++-
>  5 files changed, 61 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> index 4745a5c57e7c..eb43677b13cc 100644
> --- a/arch/riscv/Kconfig.errata
> +++ b/arch/riscv/Kconfig.errata
> @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
>  
>  	  If you don't know what to do here, say "Y".
>  
> +config ERRATA_THEAD_QSPINLOCK
> +	bool "Apply T-Head queued spinlock errata"
> +	depends on ERRATA_THEAD
> +	default y
> +	help
> +	  The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> +	  match the xchg_tail requirement of qspinlock.
> +
> +	  This will apply the QSPINLOCK errata to handle the non-standard
> +	  behavior via using qspinlock instead of ticket_lock.

Whatever about the acceptability of anything else in this series,
having _stronger_ guarantees is not an erratum, is it? We should not
abuse the errata stuff for this IMO.

> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index f8dbbe1bbd34..d9694fe40a9a 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
>  		 * spinlock value, the only way is to change from queued_spinlock to
>  		 * ticket_spinlock, but can not be vice.
>  		 */
> -		if (!force_qspinlock) {
> +		if (!force_qspinlock &&
> +		    !riscv_has_errata_thead_qspinlock()) {
>  			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);

Is this a generic vendor extension (lol @ that misnomer) or is it an
erratum? Make your mind up please. As has been said on other series, NAK
to using march/vendor/imp IDs for feature probing.

I've got some thoughts on other parts of this series too, but I'm not
going to spend time on it unless the locking people and Palmer assent
to this series.

Cheers,
Conor.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-04  9:05     ` Conor Dooley
@ 2023-08-04  9:53       ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-04  9:53 UTC (permalink / raw)
  To: Conor Dooley
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Fri, Aug 4, 2023 at 5:06 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey Guo Ren,
>
> On Wed, Aug 02, 2023 at 12:46:49PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > According to qspinlock requirements, RISC-V gives out a weak LR/SC
> > forward progress guarantee which does not satisfy qspinlock. But
> > many vendors could produce stronger forward guarantee LR/SC to
> > ensure the xchg_tail could be finished in time on any kind of
> > hart. T-HEAD is the vendor which implements strong forward
> > guarantee LR/SC instruction pairs, so enable qspinlock for T-HEAD
> > with errata help.
> >
> > T-HEAD early version of processors has the merge buffer delay
> > problem, so we need ERRATA_WRITEONCE to support qspinlock.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/Kconfig.errata              | 13 +++++++++++++
> >  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
> >  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
> >  arch/riscv/include/asm/vendorid_list.h |  3 ++-
> >  arch/riscv/kernel/cpufeature.c         |  3 ++-
> >  5 files changed, 61 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> > index 4745a5c57e7c..eb43677b13cc 100644
> > --- a/arch/riscv/Kconfig.errata
> > +++ b/arch/riscv/Kconfig.errata
> > @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
> >
> >         If you don't know what to do here, say "Y".
> >
> > +config ERRATA_THEAD_QSPINLOCK
> > +     bool "Apply T-Head queued spinlock errata"
> > +     depends on ERRATA_THEAD
> > +     default y
> > +     help
> > +       The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> > +       match the xchg_tail requirement of qspinlock.
> > +
> > +       This will apply the QSPINLOCK errata to handle the non-standard
> > +       behavior via using qspinlock instead of ticket_lock.
>
> Whatever about the acceptability of anything else in this series,
> having _stronger_ guarantees is not an erratum, is it? We should not
> abuse the errata stuff for this IMO.
>
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index f8dbbe1bbd34..d9694fe40a9a 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> >                * spinlock value, the only way is to change from queued_spinlock to
> >                * ticket_spinlock, but can not be vice.
> >                */
> > -             if (!force_qspinlock) {
> > +             if (!force_qspinlock &&
> > +                 !riscv_has_errata_thead_qspinlock()) {
> >                       set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
>
> Is this a generic vendor extension (lol @ that misnomer) or is it an
> erratum? Make your mind up please. As has been said on other series, NAK
> to using march/vendor/imp IDs for feature probing.
The RISCV_ISA_EXT_XTICKETLOCK is a feature extension number, and it's
set by default for forward compatibility. We also define a vendor
extension (riscv_has_errata_thead_qspinlock) to force all our
processors to use qspinlock; others still stay on ticket_lock.

The only possible switching direction is from qspinlock to
ticket_lock, because ticket_lock would dirty the lock value, which
prevents switching to qspinlock afterwards. So we start up with
qspinlock and switch to ticket_lock before SMP bring-up. You could
also use the cmdline to try qspinlock (force_qspinlock).

>
> I've got some thoughts on other parts of this series too, but I'm not
> going to spend time on it unless the locking people and Palmer assent
> to this series.
>
> Cheers,
> Conor.



-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread
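
The one-way restriction Guo Ren describes follows from the lock-word
encodings. A simplified illustration (assuming the generic kernel
layouts; the exact field packing is an assumption here):

	/* qspinlock: the 32-bit word is 0 iff the lock is free. */
	static bool qspinlock_is_free(u32 val)
	{
		return val == 0;
	}

	/* ticket lock: free whenever next == owner, even if both are nonzero. */
	static bool ticket_is_free(u32 val)
	{
		return (val >> 16) == (val & 0xffff);
	}

After a ticket lock has been taken and released N times, its word is
roughly (N << 16) | N, which ticket_is_free() reports as free but a
qspinlock would read as permanently held. Only a never-used word (0)
is free under both encodings, which is why the switch must happen
before any lock is taken, i.e. before SMP bring-up.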

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
@ 2023-08-04  9:53       ` Guo Ren
  0 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-04  9:53 UTC (permalink / raw)
  To: Conor Dooley
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Fri, Aug 4, 2023 at 5:06 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> Hey Guo Ren,
>
> On Wed, Aug 02, 2023 at 12:46:49PM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > According to qspinlock requirements, RISC-V gives out a weak LR/SC
> > forward progress guarantee which does not satisfy qspinlock. But
> > many vendors could produce stronger forward guarantee LR/SC to
> > ensure the xchg_tail could be finished in time on any kind of
> > hart. T-HEAD is the vendor which implements strong forward
> > guarantee LR/SC instruction pairs, so enable qspinlock for T-HEAD
> > with errata help.
> >
> > T-HEAD early version of processors has the merge buffer delay
> > problem, so we need ERRATA_WRITEONCE to support qspinlock.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/Kconfig.errata              | 13 +++++++++++++
> >  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
> >  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
> >  arch/riscv/include/asm/vendorid_list.h |  3 ++-
> >  arch/riscv/kernel/cpufeature.c         |  3 ++-
> >  5 files changed, 61 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> > index 4745a5c57e7c..eb43677b13cc 100644
> > --- a/arch/riscv/Kconfig.errata
> > +++ b/arch/riscv/Kconfig.errata
> > @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
> >
> >         If you don't know what to do here, say "Y".
> >
> > +config ERRATA_THEAD_QSPINLOCK
> > +     bool "Apply T-Head queued spinlock errata"
> > +     depends on ERRATA_THEAD
> > +     default y
> > +     help
> > +       The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> > +       match the xchg_tail requirement of qspinlock.
> > +
> > +       This will apply the QSPINLOCK errata to handle the non-standard
> > +       behavior via using qspinlock instead of ticket_lock.
>
> Whatever about the acceptability of anything else in this series,
> having _stronger_ guarantees is not an erratum, is it? We should not
> abuse the errata stuff for this IMO.
>
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index f8dbbe1bbd34..d9694fe40a9a 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> >                * spinlock value, the only way is to change from queued_spinlock to
> >                * ticket_spinlock, but can not be vice.
> >                */
> > -             if (!force_qspinlock) {
> > +             if (!force_qspinlock &&
> > +                 !riscv_has_errata_thead_qspinlock()) {
> >                       set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
>
> Is this a generic vendor extension (lol @ that misnomer) or is it an
> erratum? Make your mind up please. As has been said on other series, NAK
> to using march/vendor/imp IDs for feature probing.
The RISCV_ISA_EXT_XTICKETLOCK is a feature extension number, and it's
set by default for forward compatibility. We also define a vendor
extension (riscv_has_errata_thead_qspinlock) to force all of our
processors to use qspinlock; other vendors stay on ticket_lock.

The only possible switching direction is from qspinlock to ticket_lock,
because ticket_lock dirties the lock value, which prevents switching
back to qspinlock afterwards. So we start up with qspinlock and switch
to ticket_lock before SMP bring-up. You can also use the cmdline to
force qspinlock (force_qspinlock).
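
To see why the value gets "dirty", here is a user-space sketch with
simplified lock-word layouts (illustrative only, not the kernel's
exact types):

#include <assert.h>
#include <stdint.h>

/* Simplified lock-word layouts, for illustration only. */
typedef union {
        uint32_t val;           /* qspinlock: val == 0 <=> unlocked */
        struct {
                uint16_t owner; /* ticket: currently served ticket  */
                uint16_t next;  /* ticket: next ticket to hand out  */
        } ticket;
} lockword_t;

int main(void)
{
        lockword_t lock = { .val = 0 }; /* fresh: unlocked for both types */

        /* One ticket lock/unlock cycle: take a ticket, then serve it. */
        lock.ticket.next++;
        lock.ticket.owner++;

        /* Still unlocked by ticket rules ... */
        assert(lock.ticket.owner == lock.ticket.next);
        /* ... but non-zero, so qspinlock would now misread it as held. */
        assert(lock.val != 0);
        return 0;
}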

>
> I've got some thoughts on other parts of this series too, but I'm not
> going to spend time on it unless the locking people and Palmer assent
> to this series.
>
> Cheers,
> Conor.



-- 
Best Regards
 Guo Ren

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-04  9:53       ` Guo Ren
@ 2023-08-04 10:06         ` Conor Dooley
  -1 siblings, 0 replies; 77+ messages in thread
From: Conor Dooley @ 2023-08-04 10:06 UTC (permalink / raw)
  To: Guo Ren
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

[-- Attachment #1: Type: text/plain, Size: 2714 bytes --]

On Fri, Aug 04, 2023 at 05:53:35PM +0800, Guo Ren wrote:
> On Fri, Aug 4, 2023 at 5:06 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> > On Wed, Aug 02, 2023 at 12:46:49PM -0400, guoren@kernel.org wrote:
> > > From: Guo Ren <guoren@linux.alibaba.com>

> > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > > index f8dbbe1bbd34..d9694fe40a9a 100644
> > > --- a/arch/riscv/kernel/cpufeature.c
> > > +++ b/arch/riscv/kernel/cpufeature.c
> > > @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> > >                * spinlock value, the only way is to change from queued_spinlock to
> > >                * ticket_spinlock, but can not be vice.
> > >                */
> > > -             if (!force_qspinlock) {
> > > +             if (!force_qspinlock &&
> > > +                 !riscv_has_errata_thead_qspinlock()) {
> > >                       set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
> >
> > Is this a generic vendor extension (lol @ that misnomer) or is it an
> > erratum? Make your mind up please. As has been said on other series, NAK
> > to using march/vendor/imp IDs for feature probing.
>
> The RISCV_ISA_EXT_XTICKETLOCK is a feature extension number,

No, that is not what "ISA_EXT" means, nor what the X in "XTICKETLOCK"
would imply.

The comment above these reads:
  These macros represent the logical IDs of each multi-letter RISC-V ISA
  extension and are used in the ISA bitmap.

> and it's
> set by default for forward compatibility. We also define a vendor
> extension (riscv_has_errata_thead_qspinlock) to force all of our
> processors to use qspinlock; other vendors stay on ticket_lock.

No, "riscv_has_errata_thead_qspinlock()" would be an _erratum_, not a
vendor extension. We need to have a discussion about how to support
non-standard extensions etc, not abuse errata. That discussion has been
started on the v0.7.1 vector patches, but has not made progress yet.

> The only possible switching direction is from qspinlock to ticket_lock,
> because ticket_lock dirties the lock value, which prevents switching
> back to qspinlock afterwards. So we start up with qspinlock and switch
> to ticket_lock before SMP bring-up. You can also use the cmdline to
> force qspinlock (force_qspinlock).

I don't see what the relevance of this is, sorry. I am only commenting
on how you are deciding that the hardware is capable of using qspinlocks;
I don't intend to get into the detail unless the powers that be deem
this series worthwhile, as I mentioned:
> > I've got some thoughts on other parts of this series too, but I'm not
> > going to spend time on it unless the locking people and Palmer assent
> > to this series.


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-04 10:06         ` Conor Dooley
@ 2023-08-05  1:28           ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-05  1:28 UTC (permalink / raw)
  To: Conor Dooley
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Fri, Aug 4, 2023 at 6:07 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Fri, Aug 04, 2023 at 05:53:35PM +0800, Guo Ren wrote:
> > On Fri, Aug 4, 2023 at 5:06 PM Conor Dooley <conor.dooley@microchip.com> wrote:
> > > On Wed, Aug 02, 2023 at 12:46:49PM -0400, guoren@kernel.org wrote:
> > > > From: Guo Ren <guoren@linux.alibaba.com>
>
> > > > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > > > index f8dbbe1bbd34..d9694fe40a9a 100644
> > > > --- a/arch/riscv/kernel/cpufeature.c
> > > > +++ b/arch/riscv/kernel/cpufeature.c
> > > > @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> > > >                * spinlock value, the only way is to change from queued_spinlock to
> > > >                * ticket_spinlock, but can not be vice.
> > > >                */
> > > > -             if (!force_qspinlock) {
> > > > +             if (!force_qspinlock &&
> > > > +                 !riscv_has_errata_thead_qspinlock()) {
> > > >                       set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
> > >
> > > Is this a generic vendor extension (lol @ that misnomer) or is it an
> > > erratum? Make your mind up please. As has been said on other series, NAK
> > > to using march/vendor/imp IDs for feature probing.
> >
> > The RISCV_ISA_EXT_XTICKETLOCK is a feature extension number,
>
> No, that is not what "ISA_EXT" means, nor what the X in "XTICKETLOCK"
> would imply.
>
> The comment above these reads:
>   These macros represent the logical IDs of each multi-letter RISC-V ISA
>   extension and are used in the ISA bitmap.
>
> > and it's
> > set by default for forward compatibility. We also define a vendor
> > extension (riscv_has_errata_thead_qspinlock) to force all of our
> > processors to use qspinlock; other vendors stay on ticket_lock.
>
> No, "riscv_has_errata_thead_qspinlock()" would be an _erratum_, not a
> vendor extension. We need to have a discussion about how to support
> non-standard extensions etc, not abuse errata. That discussion has been
> started on the v0.7.1 vector patches, but has not made progress yet.
You've convinced me: yes, I abused errata here. I will switch to the
standard Linux static_key mechanism next.
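
Roughly along these lines (a sketch only; use_qspinlock_key,
cpu_has_strong_lrsc() and riscv_spinlock_init() are illustrative
names, not from this series, and it assumes both lock implementations
are visible):

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_TRUE(use_qspinlock_key);

static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
{
        if (static_branch_likely(&use_qspinlock_key))
                queued_spin_lock(lock);
        else
                ticket_spin_lock(lock);
}

/* Early in boot, before the other harts come up, a platform without
 * the stronger LR/SC guarantee flips the key exactly once: */
void __init riscv_spinlock_init(void)
{
        if (!cpu_has_strong_lrsc())     /* hypothetical probe */
                static_branch_disable(&use_qspinlock_key);
}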

>
> > The only possible switching direction is from qspinlock to ticket_lock,
> > because ticket_lock dirties the lock value, which prevents switching
> > back to qspinlock afterwards. So we start up with qspinlock and switch
> > to ticket_lock before SMP bring-up. You can also use the cmdline to
> > force qspinlock (force_qspinlock).
>
> I don't see what the relevance of this is, sorry. I am only commenting
> on how you are deciding that the hardware is capable of using qspinlocks;
> I don't intend to get into the detail unless the powers that be deem
> this series worthwhile, as I mentioned:
> > > I've got some thoughts on other parts of this series too, but I'm not
> > > going to spend time on it unless the locking people and Palmer assent
> > > to this series.
>

--
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-02 16:46   ` guoren
@ 2023-08-07  5:23     ` Stefan O'Rear
  -1 siblings, 0 replies; 77+ messages in thread
From: Stefan O'Rear @ 2023-08-07  5:23 UTC (permalink / raw)
  To: guoren, paul.walmsley, Anup Patel, peterz, mingo, will,
	Palmer Dabbelt, longman, boqun.feng, tglx, paulmck, rostedt,
	rdunlap, catalin.marinas, Conor Dooley, xiaoguang.xing,
	Björn Töpel, alexghiti, Kees Cook, greentime.hu,
	Andrew Jones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren

On Wed, Aug 2, 2023, at 12:46 PM, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> The RISC-V ISA only requires a weak LR/SC forward progress
> guarantee, which does not satisfy qspinlock's requirements. But
> many vendors could implement a stronger LR/SC forward progress
> guarantee that ensures xchg_tail finishes in time on any kind of
> hart. T-HEAD is a vendor which implements such strong forward
> progress guarantee LR/SC instruction pairs, so enable qspinlock
> for T-HEAD with the errata mechanism's help.
>
> Early versions of T-HEAD processors have the merge buffer delay
> problem, so we also need ERRATA_WRITEONCE to support qspinlock.
>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  arch/riscv/Kconfig.errata              | 13 +++++++++++++
>  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
>  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
>  arch/riscv/include/asm/vendorid_list.h |  3 ++-
>  arch/riscv/kernel/cpufeature.c         |  3 ++-
>  5 files changed, 61 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> index 4745a5c57e7c..eb43677b13cc 100644
> --- a/arch/riscv/Kconfig.errata
> +++ b/arch/riscv/Kconfig.errata
> @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
> 
>  	  If you don't know what to do here, say "Y".
> 
> +config ERRATA_THEAD_QSPINLOCK
> +	bool "Apply T-Head queued spinlock errata"
> +	depends on ERRATA_THEAD
> +	default y
> +	help
> +	  The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> +	  match the xchg_tail requirement of qspinlock.
> +
> +	  This will apply the QSPINLOCK errata to handle the non-standard
> +	  behavior via using qspinlock instead of ticket_lock.
> +
> +	  If you don't know what to do here, say "Y".

If this is to be applied, I would like to see a detailed explanation somewhere,
preferably with citations, of:

(a) The memory model requirements for qspinlock
(b) Why, with arguments, RISC-V does not architecturally meet (a)
(c) Why, with arguments, T-HEAD C9xx meets (a)
(d) Why at least one other architecture which defines ARCH_USE_QUEUED_SPINLOCKS
    meets (a)

As far as I can tell, the RISC-V guarantees concerning constrained LR/SC loops
(livelock freedom but no starvation freedom) are exactly the same as those in
Armv8 (as of 0487F.c) for equivalent loops, and xchg_tail compiles to a
constrained LR/SC loop with guaranteed eventual success (with -O1).  Clearly you
disagree; I would like to see your perspective.
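
For concreteness, the loop in question looks roughly like this on RV64
(a sketch only: tail placement follows the generic qspinlock layout,
and the helper name and register choices are illustrative, not actual
compiler output):

#include <linux/types.h>

#define TAIL_MASK       0xffff0000U     /* tail (idx + cpu) in the top half */

/* 16-bit exchange of the tail field, emulated with word-wide LR/SC. */
static inline u32 xchg_tail_sketch(u32 *lockval, u32 new_tail)
{
        u32 old, tmp;

        asm volatile(
        "1:     lr.w    %0, %2\n"       /* reserve the whole 32-bit word */
        "       and     %1, %0, %3\n"   /* drop the current tail bits    */
        "       or      %1, %1, %4\n"   /* splice in the new tail        */
        "       sc.w    %1, %1, %2\n"   /* fails if anything in the word */
        "       bnez    %1, 1b\n"       /* changed, even locked/pending  */
        : "=&r" (old), "=&r" (tmp), "+A" (*lockval)
        : "r" (~TAIL_MASK), "r" (new_tail)
        : "memory");

        return old & TAIL_MASK;
}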

-s

> +
>  endmenu # "CPU errata selection"
> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> index 881729746d2e..d560dc45c0e7 100644
> --- a/arch/riscv/errata/thead/errata.c
> +++ b/arch/riscv/errata/thead/errata.c
> @@ -86,6 +86,27 @@ static bool errata_probe_write_once(unsigned int stage,
>  	return false;
>  }
> 
> +static bool errata_probe_qspinlock(unsigned int stage,
> +				   unsigned long arch_id, unsigned long impid)
> +{
> +	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_QSPINLOCK))
> +		return false;
> +
> +	/*
> +	 * The queued_spinlock torture would get in livelock without
> +	 * ERRATA_THEAD_WRITE_ONCE fixup for the early versions of T-HEAD
> +	 * processors.
> +	 */
> +	if (arch_id == 0 && impid == 0 &&
> +	    !IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
> +		return false;
> +
> +	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> +		return true;
> +
> +	return false;
> +}
> +
>  static u32 thead_errata_probe(unsigned int stage,
>  			      unsigned long archid, unsigned long impid)
>  {
> @@ -103,6 +124,9 @@ static u32 thead_errata_probe(unsigned int stage,
>  	if (errata_probe_write_once(stage, archid, impid))
>  		cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
> 
> +	if (errata_probe_qspinlock(stage, archid, impid))
> +		cpu_req_errata |= BIT(ERRATA_THEAD_QSPINLOCK);
> +
>  	return cpu_req_errata;
>  }
> 
> diff --git a/arch/riscv/include/asm/errata_list.h 
> b/arch/riscv/include/asm/errata_list.h
> index fbb2b8d39321..a696d18d1b0d 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -141,6 +141,26 @@ asm volatile(ALTERNATIVE(						\
>  	: "=r" (__ovl) :						\
>  	: "memory")
> 
> +static __always_inline bool
> +riscv_has_errata_thead_qspinlock(void)
> +{
> +	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> +		asm_volatile_goto(
> +		ALTERNATIVE(
> +		"j	%l[l_no]", "nop",
> +		THEAD_VENDOR_ID,
> +		ERRATA_THEAD_QSPINLOCK,
> +		CONFIG_ERRATA_THEAD_QSPINLOCK)
> +		: : : : l_no);
> +	} else {
> +		goto l_no;
> +	}
> +
> +	return true;
> +l_no:
> +	return false;
> +}
> +
>  #endif /* __ASSEMBLY__ */
> 
>  #endif
> diff --git a/arch/riscv/include/asm/vendorid_list.h 
> b/arch/riscv/include/asm/vendorid_list.h
> index 73078cfe4029..1f1d03877f5f 100644
> --- a/arch/riscv/include/asm/vendorid_list.h
> +++ b/arch/riscv/include/asm/vendorid_list.h
> @@ -19,7 +19,8 @@
>  #define	ERRATA_THEAD_CMO 1
>  #define	ERRATA_THEAD_PMU 2
>  #define	ERRATA_THEAD_WRITE_ONCE 3
> -#define	ERRATA_THEAD_NUMBER 4
> +#define	ERRATA_THEAD_QSPINLOCK 4
> +#define	ERRATA_THEAD_NUMBER 5
>  #endif
> 
>  #endif
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index f8dbbe1bbd34..d9694fe40a9a 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
>  		 * spinlock value, the only way is to change from queued_spinlock to
>  		 * ticket_spinlock, but can not be vice.
>  		 */
> -		if (!force_qspinlock) {
> +		if (!force_qspinlock &&
> +		    !riscv_has_errata_thead_qspinlock()) {
>  			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
>  		}
>  #endif
> -- 
> 2.36.1
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-07  5:23     ` Stefan O'Rear
@ 2023-08-08  2:12       ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-08  2:12 UTC (permalink / raw)
  To: Stefan O'Rear
  Cc: paul.walmsley, Anup Patel, peterz, mingo, will, Palmer Dabbelt,
	longman, boqun.feng, tglx, paulmck, rostedt, rdunlap,
	catalin.marinas, Conor Dooley, xiaoguang.xing,
	Björn Töpel, alexghiti, Kees Cook, greentime.hu,
	Andrew Jones, jszhang, wefu, wuwei2016, linux-arch, linux-riscv,
	linux-doc, kvm, virtualization, linux-csky, Guo Ren, guoren

On Mon, Aug 07, 2023 at 01:23:34AM -0400, Stefan O'Rear wrote:
> On Wed, Aug 2, 2023, at 12:46 PM, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The RISC-V ISA only requires a weak LR/SC forward progress
> > guarantee, which does not satisfy qspinlock's requirements. But
> > many vendors could implement a stronger LR/SC forward progress
> > guarantee that ensures xchg_tail finishes in time on any kind of
> > hart. T-HEAD is a vendor which implements such strong forward
> > progress guarantee LR/SC instruction pairs, so enable qspinlock
> > for T-HEAD with the errata mechanism's help.
> >
> > Early versions of T-HEAD processors have the merge buffer delay
> > problem, so we also need ERRATA_WRITEONCE to support qspinlock.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/Kconfig.errata              | 13 +++++++++++++
> >  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
> >  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
> >  arch/riscv/include/asm/vendorid_list.h |  3 ++-
> >  arch/riscv/kernel/cpufeature.c         |  3 ++-
> >  5 files changed, 61 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> > index 4745a5c57e7c..eb43677b13cc 100644
> > --- a/arch/riscv/Kconfig.errata
> > +++ b/arch/riscv/Kconfig.errata
> > @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
> > 
> >  	  If you don't know what to do here, say "Y".
> > 
> > +config ERRATA_THEAD_QSPINLOCK
> > +	bool "Apply T-Head queued spinlock errata"
> > +	depends on ERRATA_THEAD
> > +	default y
> > +	help
> > +	  The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> > +	  match the xchg_tail requirement of qspinlock.
> > +
> > +	  This will apply the QSPINLOCK errata to handle the non-standard
> > +	  behavior via using qspinlock instead of ticket_lock.
> > +
> > +	  If you don't know what to do here, say "Y".
> 
> If this is to be applied, I would like to see a detailed explanation somewhere,
> preferably with citations, of:
> 
> (a) The memory model requirements for qspinlock
These were written down in commit a8ad07e5240 ("asm-generic: qspinlock:
Indicate the use of mixed-size atomics"). For riscv, the most
controversial point is the xchg_tail() implementation for the native
queued spinlock.

> (b) Why, with arguments, RISC-V does not architecturally meet (a)
In the spec "Eventual Success of Store-Conditional Instructions":
"By contrast, if other harts or devices continue to write to that reservation set, it is
not guaranteed that any hart will exit its LR/SC loop."

1. The arch_spinlock_t is 32 bits wide, and it contains a LOCK_PENDING
   part and an IDX_TAIL part.
    - LOCK:     lock holder
    - PENDING:  next waiter (only once per contended situation)
    - IDX:      nesting context (normal, hwirq, softirq, nmi)
    - TAIL:     last contended cpu
   xchg_tail operates on the IDX_TAIL part only, so there is no
   guarantee that no "other harts or devices continue to write to that
   reservation set".

2. When you run the lock torture test, you may see a long contended ring queue:
                                                                xchg_tail
                                                                    +-----> CPU4 (big core)
                                                                    |
   CPU3 (lock holder) -> CPU1 (mcs queued) -> CPU2 (mcs queued) ----+-----> CPU0 (little core)
    |                                                               |
    |                                                               +-----> CPU5 (big core)
    |                                                               |
    +--locktorture release lock (spin_unlock) and spin_lock again --+-----> CPU3 (big core)

    If CPU0 doesn't have a strong forward progress guarantee, xchg_tail
    fails consistently.
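
For reference, the 32-bit lock word these writers share looks like this
(bit positions mirror asm-generic/qspinlock_types.h for NR_CPUS < 16K;
a user-space sketch, not kernel code):

#include <stdint.h>
#include <stdio.h>

/* Bitfields in the 32-bit qspinlock value (NR_CPUS < 16K):
 *   0- 7: locked byte     -- written by the lock holder
 *      8: pending bit     -- written by the first waiter
 *  16-17: tail index      -- written by xchg_tail
 *  18-31: tail cpu (+1)   -- written by xchg_tail
 */
#define LOCKED          0x000000ffu
#define PENDING         0x00000100u
#define TAIL            0xffff0000u

int main(void)
{
        uint32_t val = 0;

        val |= 1;                           /* holder grabs the lock       */
        val |= PENDING;                     /* second cpu parks on pending */
        val |= ((3u + 1) << 18) | (1u << 16); /* cpu3/idx1 swaps into tail */

        /* All three writers share one word, so an LR reservation taken
         * for the tail update is killed by locked/pending traffic too. */
        printf("lock word = %#x\n", val);
        return 0;
}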

> (c) Why, with arguments, T-HEAD C9xx meets (a)
> (d) Why at least one other architecture which defines ARCH_USE_QUEUED_SPINLOCKS
>     meets (a)
I can't give the C9xx microarchitecture implementation details. But many
open-source riscv cores provide a strong forward progress guarantee
LR/SC implementation [1] [2]. I would say these implementations are
too crude, though: they make every LR send a cacheline-unique
(exclusive ownership) interconnect request. That satisfies xchg_tail
but not cmpxchg & cond_load. CPU vendors should carefully consider
their LR/SC forward progress guarantee implementation.

[1]: https://github.com/riscv-boom/riscv-boom/blob/v3.0.0/src/main/scala/lsu/dcache.scala#L650
[2]: https://github.com/OpenXiangShan/XiangShan/blob/v1.0/src/main/scala/xiangshan/cache/MainPipe.scala#L470

> 
> As far as I can tell, the RISC-V guarantees concerning constrained LR/SC loops
> (livelock freedom but no starvation freedom) are exactly the same as those in
> Armv8 (as of 0487F.c) for equivalent loops, and xchg_tail compiles to a
> constrained LR/SC loop with guaranteed eventual success (with -O1).  Clearly you
> disagree; I would like to see your perspective.
For Armv8, I would use LSE for the lock-contended scenario. See commit
0ea366f5e1b6 ("arm64: atomics: prefetch the destination word for
write prior to stxr").
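
That is, with ARMv8.1 LSE the tail exchange becomes a single atomic
instruction rather than an LL/SC retry loop, so its completion doesn't
depend on other writers backing off. A sketch (relaxed swph variant
for brevity; the kernel's xchg helpers use the ordered forms):

#include <linux/types.h>

static inline u16 xchg_tail_lse_sketch(u16 *tailp, u16 newval)
{
        u16 old;

        /* SWPH: atomic halfword swap, one instruction, no retry loop. */
        asm volatile("swph %w2, %w0, %1"
                     : "=&r" (old), "+Q" (*tailp)
                     : "r" (newval)
                     : "memory");
        return old;
}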

> 
> -s
> 
> > +
> >  endmenu # "CPU errata selection"
> > diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> > index 881729746d2e..d560dc45c0e7 100644
> > --- a/arch/riscv/errata/thead/errata.c
> > +++ b/arch/riscv/errata/thead/errata.c
> > @@ -86,6 +86,27 @@ static bool errata_probe_write_once(unsigned int stage,
> >  	return false;
> >  }
> > 
> > +static bool errata_probe_qspinlock(unsigned int stage,
> > +				   unsigned long arch_id, unsigned long impid)
> > +{
> > +	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_QSPINLOCK))
> > +		return false;
> > +
> > +	/*
> > +	 * The queued_spinlock torture would get in livelock without
> > +	 * ERRATA_THEAD_WRITE_ONCE fixup for the early versions of T-HEAD
> > +	 * processors.
> > +	 */
> > +	if (arch_id == 0 && impid == 0 &&
> > +	    !IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
> > +		return false;
> > +
> > +	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> > +		return true;
> > +
> > +	return false;
> > +}
> > +
> >  static u32 thead_errata_probe(unsigned int stage,
> >  			      unsigned long archid, unsigned long impid)
> >  {
> > @@ -103,6 +124,9 @@ static u32 thead_errata_probe(unsigned int stage,
> >  	if (errata_probe_write_once(stage, archid, impid))
> >  		cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
> > 
> > +	if (errata_probe_qspinlock(stage, archid, impid))
> > +		cpu_req_errata |= BIT(ERRATA_THEAD_QSPINLOCK);
> > +
> >  	return cpu_req_errata;
> >  }
> > 
> > diff --git a/arch/riscv/include/asm/errata_list.h 
> > b/arch/riscv/include/asm/errata_list.h
> > index fbb2b8d39321..a696d18d1b0d 100644
> > --- a/arch/riscv/include/asm/errata_list.h
> > +++ b/arch/riscv/include/asm/errata_list.h
> > @@ -141,6 +141,26 @@ asm volatile(ALTERNATIVE(						\
> >  	: "=r" (__ovl) :						\
> >  	: "memory")
> > 
> > +static __always_inline bool
> > +riscv_has_errata_thead_qspinlock(void)
> > +{
> > +	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> > +		asm_volatile_goto(
> > +		ALTERNATIVE(
> > +		"j	%l[l_no]", "nop",
> > +		THEAD_VENDOR_ID,
> > +		ERRATA_THEAD_QSPINLOCK,
> > +		CONFIG_ERRATA_THEAD_QSPINLOCK)
> > +		: : : : l_no);
> > +	} else {
> > +		goto l_no;
> > +	}
> > +
> > +	return true;
> > +l_no:
> > +	return false;
> > +}
> > +
> >  #endif /* __ASSEMBLY__ */
> > 
> >  #endif
> > diff --git a/arch/riscv/include/asm/vendorid_list.h 
> > b/arch/riscv/include/asm/vendorid_list.h
> > index 73078cfe4029..1f1d03877f5f 100644
> > --- a/arch/riscv/include/asm/vendorid_list.h
> > +++ b/arch/riscv/include/asm/vendorid_list.h
> > @@ -19,7 +19,8 @@
> >  #define	ERRATA_THEAD_CMO 1
> >  #define	ERRATA_THEAD_PMU 2
> >  #define	ERRATA_THEAD_WRITE_ONCE 3
> > -#define	ERRATA_THEAD_NUMBER 4
> > +#define	ERRATA_THEAD_QSPINLOCK 4
> > +#define	ERRATA_THEAD_NUMBER 5
> >  #endif
> > 
> >  #endif
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index f8dbbe1bbd34..d9694fe40a9a 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> >  		 * spinlock value, the only way is to change from queued_spinlock to
> >  		 * ticket_spinlock, but can not be vice.
> >  		 */
> > -		if (!force_qspinlock) {
> > +		if (!force_qspinlock &&
> > +		    !riscv_has_errata_thead_qspinlock()) {
> >  			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
> >  		}
> >  #endif
> > -- 
> > 2.36.1
> >
> >
> > _______________________________________________
> > linux-riscv mailing list
> > linux-riscv@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 04/19] riscv: qspinlock: Add basic queued_spinlock support
  2023-08-02 16:46   ` guoren
  (?)
@ 2023-08-11 19:34     ` Waiman Long
  -1 siblings, 0 replies; 77+ messages in thread
From: Waiman Long @ 2023-08-11 19:34 UTC (permalink / raw)
  To: guoren, paul.walmsley, anup, peterz, mingo, will, palmer,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren


On 8/2/23 12:46, guoren@kernel.org wrote:
> 	\
> diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
> new file mode 100644
> index 000000000000..c644a92d4548
> --- /dev/null
> +++ b/arch/riscv/include/asm/spinlock.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_RISCV_SPINLOCK_H
> +#define __ASM_RISCV_SPINLOCK_H
> +
> +#ifdef CONFIG_QUEUED_SPINLOCKS
> +#define _Q_PENDING_LOOPS	(1 << 9)
> +#endif
> +
> +#ifdef CONFIG_QUEUED_SPINLOCKS

You can merge the two "#ifdef CONFIG_QUEUED_SPINLOCKS" into a single one
to avoid the duplication.
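
Something like this minimal merged block (same contents as the quoted
hunk, just under one guard):

#ifdef CONFIG_QUEUED_SPINLOCKS
#define _Q_PENDING_LOOPS        (1 << 9)
#include <asm/qspinlock.h>
#include <asm/qrwlock.h>
#else
#include <asm-generic/spinlock.h>
#endif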

Cheers,
Longman

> +#include <asm/qspinlock.h>
> +#include <asm/qrwlock.h>
> +#else
> +#include <asm-generic/spinlock.h>
> +#endif
> +
> +#endif /* __ASM_RISCV_SPINLOCK_H */


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 05/19] riscv: qspinlock: Introduce combo spinlock
  2023-08-02 16:46   ` guoren
  (?)
@ 2023-08-11 19:51     ` Waiman Long
  -1 siblings, 0 replies; 77+ messages in thread
From: Waiman Long @ 2023-08-11 19:51 UTC (permalink / raw)
  To: guoren, paul.walmsley, anup, peterz, mingo, will, palmer,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren

On 8/2/23 12:46, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> The combo spinlock supports both queued and ticket locks in one Linux
> image and selects between them at boot time via the errata mechanism.
> Here is the function size (bytes) comparison table:
>
> TYPE			: COMBO | TICKET | QUEUED
> arch_spin_lock		: 106	| 60     | 50
> arch_spin_unlock	: 54    | 36     | 26
> arch_spin_trylock	: 110   | 72     | 54
> arch_spin_is_locked	: 48    | 34     | 20
> arch_spin_is_contended	: 56    | 40     | 24
> arch_spin_value_unlocked	: 48    | 34     | 24
>
> One example of disassemble combo arch_spin_unlock:
>     0xffffffff8000409c <+14>:    nop                # detour slot
>     0xffffffff800040a0 <+18>:    fence   rw,w       # queued spinlock start
>     0xffffffff800040a4 <+22>:    sb      zero,0(a4) # queued spinlock end
>     0xffffffff800040a8 <+26>:    ld      s0,8(sp)
>     0xffffffff800040aa <+28>:    addi    sp,sp,16
>     0xffffffff800040ac <+30>:    ret
>     0xffffffff800040ae <+32>:    lw      a5,0(a4)   # ticket spinlock start
>     0xffffffff800040b0 <+34>:    sext.w  a5,a5
>     0xffffffff800040b2 <+36>:    fence   rw,w
>     0xffffffff800040b6 <+40>:    addiw   a5,a5,1
>     0xffffffff800040b8 <+42>:    slli    a5,a5,0x30
>     0xffffffff800040ba <+44>:    srli    a5,a5,0x30
>     0xffffffff800040bc <+46>:    sh      a5,0(a4)   # ticket spinlock end
>     0xffffffff800040c0 <+50>:    ld      s0,8(sp)
>     0xffffffff800040c2 <+52>:    addi    sp,sp,16
>     0xffffffff800040c4 <+54>:    ret
>
> The qspinlock is smaller and faster than the ticket lock when everything
> stays on the fast path, and the combo spinlock provides a single
> compatible Linux image for processors with different micro-arch designs
> (weak/strict forward progress guarantees).
>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> ---
>   arch/riscv/Kconfig                |  9 +++-
>   arch/riscv/include/asm/hwcap.h    |  1 +
>   arch/riscv/include/asm/spinlock.h | 87 ++++++++++++++++++++++++++++++-
>   arch/riscv/kernel/cpufeature.c    | 10 ++++
>   4 files changed, 104 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index e89a3bea3dc1..119e774a3dcf 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -440,7 +440,7 @@ config NODES_SHIFT
>   
>   choice
>   	prompt "RISC-V spinlock type"
> -	default RISCV_TICKET_SPINLOCKS
> +	default RISCV_COMBO_SPINLOCKS
>   
>   config RISCV_TICKET_SPINLOCKS
>   	bool "Using ticket spinlock"
> @@ -452,6 +452,13 @@ config RISCV_QUEUED_SPINLOCKS
>   	help
>   	  Make sure your micro arch LL/SC has a strong forward progress guarantee.
>   	  Otherwise, stay at ticket-lock.
> +
> +config RISCV_COMBO_SPINLOCKS
> +	bool "Using combo spinlock"
> +	depends on SMP && MMU
> +	select ARCH_USE_QUEUED_SPINLOCKS
> +	help
> +	  Select queued spinlock or ticket-lock via errata.
>   endchoice
>   
>   config RISCV_ALTERNATIVE
> diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> index f041bfa7f6a0..08ae75a694c2 100644
> --- a/arch/riscv/include/asm/hwcap.h
> +++ b/arch/riscv/include/asm/hwcap.h
> @@ -54,6 +54,7 @@
>   #define RISCV_ISA_EXT_ZIFENCEI		41
>   #define RISCV_ISA_EXT_ZIHPM		42
>   
> +#define RISCV_ISA_EXT_XTICKETLOCK	63
>   #define RISCV_ISA_EXT_MAX		64
>   #define RISCV_ISA_EXT_NAME_LEN_MAX	32
>   
> diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
> index c644a92d4548..9eb3ad31e564 100644
> --- a/arch/riscv/include/asm/spinlock.h
> +++ b/arch/riscv/include/asm/spinlock.h
> @@ -7,11 +7,94 @@
>   #define _Q_PENDING_LOOPS	(1 << 9)
>   #endif
>   

I see why you separated the _Q_PENDING_LOOPS out.


> +#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> +#include <asm-generic/ticket_spinlock.h>
> +
> +#undef arch_spin_is_locked
> +#undef arch_spin_is_contended
> +#undef arch_spin_value_unlocked
> +#undef arch_spin_lock
> +#undef arch_spin_trylock
> +#undef arch_spin_unlock
> +
> +#include <asm-generic/qspinlock.h>
> +#include <asm/hwcap.h>
> +
> +#undef arch_spin_is_locked
> +#undef arch_spin_is_contended
> +#undef arch_spin_value_unlocked
> +#undef arch_spin_lock
> +#undef arch_spin_trylock
> +#undef arch_spin_unlock
Perhaps you can add a macro like __no_arch_spinlock_redefine to disable 
the various arch_spin_* definitions in qspinlock.h and ticket_spinlock.h.
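
A minimal sketch of that idea (the guard name comes from the suggestion
above; the exact shape is an assumption, not yet an existing kernel API):
each generic header would define its arch_spin_* wrappers only when the
guard is absent, so combo code could include both headers without the
#undef churn.

	/* in asm-generic/qspinlock.h (ticket_spinlock.h analogous) */
	#ifndef __no_arch_spinlock_redefine
	#define arch_spin_is_locked(l)		queued_spin_is_locked(l)
	#define arch_spin_is_contended(l)	queued_spin_is_contended(l)
	#define arch_spin_value_unlocked(l)	queued_spin_value_unlocked(l)
	#define arch_spin_lock(l)		queued_spin_lock(l)
	#define arch_spin_trylock(l)		queued_spin_trylock(l)
	#define arch_spin_unlock(l)		queued_spin_unlock(l)
	#endif

	/* in the riscv combo header */
	#define __no_arch_spinlock_redefine
	#include <asm-generic/ticket_spinlock.h>
	#include <asm-generic/qspinlock.h>
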
> +
> +#define COMBO_DETOUR				\
> +	asm_volatile_goto(ALTERNATIVE(		\
> +		"nop",				\
> +		"j %l[ticket_spin_lock]",	\
> +		0,				\
> +		RISCV_ISA_EXT_XTICKETLOCK,	\
> +		CONFIG_RISCV_COMBO_SPINLOCKS)	\
> +		: : : : ticket_spin_lock);
> +
> +static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> +{
> +	COMBO_DETOUR
> +	queued_spin_lock(lock);
> +	return;
> +ticket_spin_lock:
> +	ticket_spin_lock(lock);
> +}
> +
> +static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
> +{
> +	COMBO_DETOUR
> +	return queued_spin_trylock(lock);
> +ticket_spin_lock:
> +	return ticket_spin_trylock(lock);
> +}
> +
> +static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
> +{
> +	COMBO_DETOUR
> +	queued_spin_unlock(lock);
> +	return;
> +ticket_spin_lock:
> +	ticket_spin_unlock(lock);
> +}
> +
> +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> +{
> +	COMBO_DETOUR
> +	return queued_spin_value_unlocked(lock);
> +ticket_spin_lock:
> +	return ticket_spin_value_unlocked(lock);
> +}
> +
> +static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
> +{
> +	COMBO_DETOUR
> +	return queued_spin_is_locked(lock);
> +ticket_spin_lock:
> +	return ticket_spin_is_locked(lock);
> +}
> +
> +static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> +{
> +	COMBO_DETOUR
> +	return queued_spin_is_contended(lock);
> +ticket_spin_lock:
> +	return ticket_spin_is_contended(lock);
> +}
> +#else /* CONFIG_RISCV_COMBO_SPINLOCKS */
> +
>   #ifdef CONFIG_QUEUED_SPINLOCKS
>   #include <asm/qspinlock.h>
> -#include <asm/qrwlock.h>
>   #else
> -#include <asm-generic/spinlock.h>
> +#include <asm-generic/ticket_spinlock.h>
>   #endif
>   
> +#endif /* CONFIG_RISCV_COMBO_SPINLOCKS */
> +
> +#include <asm/qrwlock.h>
> +
>   #endif /* __ASM_RISCV_SPINLOCK_H */
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index bdcf460ea53d..e65b0e54152d 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -324,6 +324,16 @@ void __init riscv_fill_hwcap(void)
>   		set_bit(RISCV_ISA_EXT_ZICSR, isainfo->isa);
>   		set_bit(RISCV_ISA_EXT_ZIFENCEI, isainfo->isa);
>   
> +#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> +		/*
> +		 * The RISC-V Linux used queued spinlock at first; then, we used ticket_lock
> +		 * as default or queued spinlock by choice. Because ticket_lock would dirty
> +		 * spinlock value, the only way is to change from queued_spinlock to
> +		 * ticket_spinlock, but can not be vice.

The phrase "but can not be vice" is confusing. I think you mean "but not 
vice versa". Right?

Cheers,
Longman



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory
  2023-08-02 16:47   ` guoren
  (?)
@ 2023-08-11 20:42     ` Waiman Long
  -1 siblings, 0 replies; 77+ messages in thread
From: Waiman Long @ 2023-08-11 20:42 UTC (permalink / raw)
  To: guoren, paul.walmsley, anup, peterz, mingo, will, palmer,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, catalin.marinas,
	conor.dooley, xiaoguang.xing, bjorn, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016
  Cc: linux-arch, linux-riscv, linux-doc, kvm, virtualization,
	linux-csky, Guo Ren

On 8/2/23 12:47, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
>
> The pv_ops mechanism belongs to x86's custom infrastructure, so move it
> there and clean up cna_configure_spin_lock_slowpath() with standard code.
> This is preparation for riscv to support CNA qspinlock.

CNA qspinlock has not been merged into mainline yet. I would suggest you 
drop the last two patches for now. Of course, you can provide benchmark 
data to strengthen the case for including the CNA qspinlock patch in 
mainline.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 04/19] riscv: qspinlock: Add basic queued_spinlock support
  2023-08-11 19:34     ` Waiman Long
@ 2023-08-12  0:18       ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-12  0:18 UTC (permalink / raw)
  To: Waiman Long
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, boqun.feng,
	tglx, paulmck, rostedt, rdunlap, catalin.marinas, conor.dooley,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Sat, Aug 12, 2023 at 3:34 AM Waiman Long <longman@redhat.com> wrote:
>
>
> On 8/2/23 12:46, guoren@kernel.org wrote:
> >       \
> > diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
> > new file mode 100644
> > index 000000000000..c644a92d4548
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/spinlock.h
> > @@ -0,0 +1,17 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef __ASM_RISCV_SPINLOCK_H
> > +#define __ASM_RISCV_SPINLOCK_H
> > +
> > +#ifdef CONFIG_QUEUED_SPINLOCKS
> > +#define _Q_PENDING_LOOPS     (1 << 9)
> > +#endif
> > +
> > +#ifdef CONFIG_QUEUED_SPINLOCKS
>
> You can merge the two "#ifdef CONFIG_QUEUED_SPINLOCKS" blocks into a
> single one to avoid the duplication.
Okay.
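
For reference, the merged guard would look roughly like this (same
content as the quoted hunk, just one conditional):

	#ifdef CONFIG_QUEUED_SPINLOCKS
	#define _Q_PENDING_LOOPS	(1 << 9)
	#include <asm/qspinlock.h>
	#include <asm/qrwlock.h>
	#else
	#include <asm-generic/spinlock.h>
	#endif
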

>
> Cheers,
> Longman
>
> > +#include <asm/qspinlock.h>
> > +#include <asm/qrwlock.h>
> > +#else
> > +#include <asm-generic/spinlock.h>
> > +#endif
> > +
> > +#endif /* __ASM_RISCV_SPINLOCK_H */
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 05/19] riscv: qspinlock: Introduce combo spinlock
  2023-08-11 19:51     ` Waiman Long
@ 2023-08-12  0:22       ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-12  0:22 UTC (permalink / raw)
  To: Waiman Long
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, boqun.feng,
	tglx, paulmck, rostedt, rdunlap, catalin.marinas, conor.dooley,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Sat, Aug 12, 2023 at 3:51 AM Waiman Long <longman@redhat.com> wrote:
>
> On 8/2/23 12:46, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The combo spinlock supports both queued and ticket locks in one Linux
> > image, selecting between them at boot time via the errata mechanism.
> > Here is the function size (bytes) comparison table:
> >
> > TYPE                     : COMBO | TICKET | QUEUED
> > arch_spin_lock           : 106   | 60     | 50
> > arch_spin_unlock         : 54    | 36     | 26
> > arch_spin_trylock        : 110   | 72     | 54
> > arch_spin_is_locked      : 48    | 34     | 20
> > arch_spin_is_contended   : 56    | 40     | 24
> > arch_spin_value_unlocked : 48    | 34     | 24
> >
> > One example of disassemble combo arch_spin_unlock:
> >     0xffffffff8000409c <+14>:    nop                # detour slot
> >     0xffffffff800040a0 <+18>:    fence   rw,w       # queued spinlock start
> >     0xffffffff800040a4 <+22>:    sb      zero,0(a4) # queued spinlock end
> >     0xffffffff800040a8 <+26>:    ld      s0,8(sp)
> >     0xffffffff800040aa <+28>:    addi    sp,sp,16
> >     0xffffffff800040ac <+30>:    ret
> >     0xffffffff800040ae <+32>:    lw      a5,0(a4)   # ticket spinlock start
> >     0xffffffff800040b0 <+34>:    sext.w  a5,a5
> >     0xffffffff800040b2 <+36>:    fence   rw,w
> >     0xffffffff800040b6 <+40>:    addiw   a5,a5,1
> >     0xffffffff800040b8 <+42>:    slli    a5,a5,0x30
> >     0xffffffff800040ba <+44>:    srli    a5,a5,0x30
> >     0xffffffff800040bc <+46>:    sh      a5,0(a4)   # ticket spinlock end
> >     0xffffffff800040c0 <+50>:    ld      s0,8(sp)
> >     0xffffffff800040c2 <+52>:    addi    sp,sp,16
> >     0xffffffff800040c4 <+54>:    ret
> >
> > The qspinlock is smaller and faster than the ticket lock when everything
> > stays on the fast path, and the combo spinlock can provide one compatible
> > Linux image for processors with different micro-arch designs (weak or
> > strict forward-progress guarantees).
> >
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > ---
> >   arch/riscv/Kconfig                |  9 +++-
> >   arch/riscv/include/asm/hwcap.h    |  1 +
> >   arch/riscv/include/asm/spinlock.h | 87 ++++++++++++++++++++++++++++++-
> >   arch/riscv/kernel/cpufeature.c    | 10 ++++
> >   4 files changed, 104 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index e89a3bea3dc1..119e774a3dcf 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -440,7 +440,7 @@ config NODES_SHIFT
> >
> >   choice
> >       prompt "RISC-V spinlock type"
> > -     default RISCV_TICKET_SPINLOCKS
> > +     default RISCV_COMBO_SPINLOCKS
> >
> >   config RISCV_TICKET_SPINLOCKS
> >       bool "Using ticket spinlock"
> > @@ -452,6 +452,13 @@ config RISCV_QUEUED_SPINLOCKS
> >       help
> >         Make sure your micro arch LL/SC has a strong forward progress guarantee.
> >         Otherwise, stay at ticket-lock.
> > +
> > +config RISCV_COMBO_SPINLOCKS
> > +     bool "Using combo spinlock"
> > +     depends on SMP && MMU
> > +     select ARCH_USE_QUEUED_SPINLOCKS
> > +     help
> > +       Select queued spinlock or ticket-lock via errata.
> >   endchoice
> >
> >   config RISCV_ALTERNATIVE
> > diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
> > index f041bfa7f6a0..08ae75a694c2 100644
> > --- a/arch/riscv/include/asm/hwcap.h
> > +++ b/arch/riscv/include/asm/hwcap.h
> > @@ -54,6 +54,7 @@
> >   #define RISCV_ISA_EXT_ZIFENCEI              41
> >   #define RISCV_ISA_EXT_ZIHPM         42
> >
> > +#define RISCV_ISA_EXT_XTICKETLOCK    63
> >   #define RISCV_ISA_EXT_MAX           64
> >   #define RISCV_ISA_EXT_NAME_LEN_MAX  32
> >
> > diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
> > index c644a92d4548..9eb3ad31e564 100644
> > --- a/arch/riscv/include/asm/spinlock.h
> > +++ b/arch/riscv/include/asm/spinlock.h
> > @@ -7,11 +7,94 @@
> >   #define _Q_PENDING_LOOPS    (1 << 9)
> >   #endif
> >
>
> I see why you separated the _Q_PENDING_LOOPS out.
Haha, yes, I had even forgotten about this. :)

>
>
> > +#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> > +#include <asm-generic/ticket_spinlock.h>
> > +
> > +#undef arch_spin_is_locked
> > +#undef arch_spin_is_contended
> > +#undef arch_spin_value_unlocked
> > +#undef arch_spin_lock
> > +#undef arch_spin_trylock
> > +#undef arch_spin_unlock
> > +
> > +#include <asm-generic/qspinlock.h>
> > +#include <asm/hwcap.h>
> > +
> > +#undef arch_spin_is_locked
> > +#undef arch_spin_is_contended
> > +#undef arch_spin_value_unlocked
> > +#undef arch_spin_lock
> > +#undef arch_spin_trylock
> > +#undef arch_spin_unlock
> Perhaps you can add a macro like __no_arch_spinlock_redefine to disable
> the various arch_spin_* definitions in qspinlock.h and ticket_spinlock.h.
That's great; I will try it in the next version.

> > +
> > +#define COMBO_DETOUR                         \
> > +     asm_volatile_goto(ALTERNATIVE(          \
> > +             "nop",                          \
> > +             "j %l[ticket_spin_lock]",       \
> > +             0,                              \
> > +             RISCV_ISA_EXT_XTICKETLOCK,      \
> > +             CONFIG_RISCV_COMBO_SPINLOCKS)   \
> > +             : : : : ticket_spin_lock);
> > +
> > +static __always_inline void arch_spin_lock(arch_spinlock_t *lock)
> > +{
> > +     COMBO_DETOUR
> > +     queued_spin_lock(lock);
> > +     return;
> > +ticket_spin_lock:
> > +     ticket_spin_lock(lock);
> > +}
> > +
> > +static __always_inline bool arch_spin_trylock(arch_spinlock_t *lock)
> > +{
> > +     COMBO_DETOUR
> > +     return queued_spin_trylock(lock);
> > +ticket_spin_lock:
> > +     return ticket_spin_trylock(lock);
> > +}
> > +
> > +static __always_inline void arch_spin_unlock(arch_spinlock_t *lock)
> > +{
> > +     COMBO_DETOUR
> > +     queued_spin_unlock(lock);
> > +     return;
> > +ticket_spin_lock:
> > +     ticket_spin_unlock(lock);
> > +}
> > +
> > +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock)
> > +{
> > +     COMBO_DETOUR
> > +     return queued_spin_value_unlocked(lock);
> > +ticket_spin_lock:
> > +     return ticket_spin_value_unlocked(lock);
> > +}
> > +
> > +static __always_inline int arch_spin_is_locked(arch_spinlock_t *lock)
> > +{
> > +     COMBO_DETOUR
> > +     return queued_spin_is_locked(lock);
> > +ticket_spin_lock:
> > +     return ticket_spin_is_locked(lock);
> > +}
> > +
> > +static __always_inline int arch_spin_is_contended(arch_spinlock_t *lock)
> > +{
> > +     COMBO_DETOUR
> > +     return queued_spin_is_contended(lock);
> > +ticket_spin_lock:
> > +     return ticket_spin_is_contended(lock);
> > +}
> > +#else /* CONFIG_RISCV_COMBO_SPINLOCKS */
> > +
> >   #ifdef CONFIG_QUEUED_SPINLOCKS
> >   #include <asm/qspinlock.h>
> > -#include <asm/qrwlock.h>
> >   #else
> > -#include <asm-generic/spinlock.h>
> > +#include <asm-generic/ticket_spinlock.h>
> >   #endif
> >
> > +#endif /* CONFIG_RISCV_COMBO_SPINLOCKS */
> > +
> > +#include <asm/qrwlock.h>
> > +
> >   #endif /* __ASM_RISCV_SPINLOCK_H */
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index bdcf460ea53d..e65b0e54152d 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -324,6 +324,16 @@ void __init riscv_fill_hwcap(void)
> >               set_bit(RISCV_ISA_EXT_ZICSR, isainfo->isa);
> >               set_bit(RISCV_ISA_EXT_ZIFENCEI, isainfo->isa);
> >
> > +#ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> > +             /*
> > +              * The RISC-V Linux used queued spinlock at first; then, we used ticket_lock
> > +              * as default or queued spinlock by choice. Because ticket_lock would dirty
> > +              * spinlock value, the only way is to change from queued_spinlock to
> > +              * ticket_spinlock, but can not be vice.
>
> The phrase "but can not be vice" is confusing. I think you mean "but not
> vice versa". Right?
Yes, thanks for the grammar correction.
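
To make the one-way constraint concrete, here is a minimal sketch of the
lock-word arithmetic (the 16-bit next/owner halves follow the generic
ticket lock; treat the exact field placement as illustrative):

	#include <stdint.h>

	int main(void)
	{
		uint32_t v = 0;	/* pristine word: unlocked for both schemes */

		v += 1U << 16;	/* ticket_spin_lock(): take a ticket (high half) */
		v += 1;		/* ticket_spin_unlock(): bump owner (low half) */

		/*
		 * owner == next again, so the ticket lock is unlocked, yet
		 * v == 0x10001. qspinlock treats any nonzero value as
		 * locked/pending, so a word the ticket code has touched can
		 * never be handed back to qspinlock: the switch must go from
		 * queued to ticket, never the reverse.
		 */
		return v != 0;	/* 1: the value is "dirty" */
	}
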

>
> Cheers,
> Longman
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory
  2023-08-11 20:42     ` Waiman Long
@ 2023-08-12  0:24       ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-08-12  0:24 UTC (permalink / raw)
  To: Waiman Long
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, boqun.feng,
	tglx, paulmck, rostedt, rdunlap, catalin.marinas, conor.dooley,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren

On Sat, Aug 12, 2023 at 4:42 AM Waiman Long <longman@redhat.com> wrote:
>
> On 8/2/23 12:47, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > The pv_ops mechanism belongs to x86's custom infrastructure, so move it
> > there and clean up cna_configure_spin_lock_slowpath() with standard code.
> > This is preparation for riscv to support CNA qspinlock.
>
> CNA qspinlock has not been merged into mainline yet. I would suggest you
> drop the last two patches for now. Of course, you can provide benchmark
> data to strengthen the case for including the CNA qspinlock patch in
> mainline.
Yes, that was lazy of me; I will separate paravirt and CNA from this series.

>
> Cheers,
> Longman
>


-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory
  2023-08-12  0:24       ` Guo Ren
  (?)
@ 2023-08-12  0:47         ` Waiman Long
  -1 siblings, 0 replies; 77+ messages in thread
From: Waiman Long @ 2023-08-12  0:47 UTC (permalink / raw)
  To: Guo Ren
  Cc: paul.walmsley, anup, peterz, mingo, will, palmer, boqun.feng,
	tglx, paulmck, rostedt, rdunlap, catalin.marinas, conor.dooley,
	xiaoguang.xing, bjorn, alexghiti, keescook, greentime.hu, ajones,
	jszhang, wefu, wuwei2016, linux-arch, linux-riscv, linux-doc,
	kvm, virtualization, linux-csky, Guo Ren


On 8/11/23 20:24, Guo Ren wrote:
> On Sat, Aug 12, 2023 at 4:42 AM Waiman Long <longman@redhat.com> wrote:
>> On 8/2/23 12:47, guoren@kernel.org wrote:
>>> From: Guo Ren <guoren@linux.alibaba.com>
>>>
>>> The pv_ops mechanism belongs to x86's custom infrastructure, so move it
>>> there and clean up cna_configure_spin_lock_slowpath() with standard code.
>>> This is preparation for riscv to support CNA qspinlock.
>> CNA qspinlock has not been merged into mainline yet. I would suggest you
>> drop the last two patches for now. Of course, you can provide benchmark
>> data to strengthen the case for including the CNA qspinlock patch in
>> mainline.
> Yes, that was lazy of me; I will separate paravirt and CNA from this series.

Paravirt is OK; it is just that CNA hasn't been merged yet.

Cheers,
Longman

>
>> Cheers,
>> Longman
>>
>



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-08-07  5:23     ` Stefan O'Rear
@ 2023-09-13 18:54       ` Palmer Dabbelt
  -1 siblings, 0 replies; 77+ messages in thread
From: Palmer Dabbelt @ 2023-09-13 18:54 UTC (permalink / raw)
  To: sorear
  Cc: guoren, Paul Walmsley, anup, peterz, mingo, Will Deacon, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, Catalin Marinas,
	Conor Dooley, xiaoguang.xing, Bjorn Topel, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016, linux-arch,
	linux-riscv, linux-doc, kvm, virtualization, linux-csky, guoren

On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sorear@fastmail.com wrote:
> On Wed, Aug 2, 2023, at 12:46 PM, guoren@kernel.org wrote:
>> From: Guo Ren <guoren@linux.alibaba.com>
>>
>> RISC-V's architectural LR/SC forward-progress guarantee is weaker than
>> what qspinlock requires, but many vendors can implement a stronger
>> LR/SC forward-progress guarantee that ensures xchg_tail finishes in
>> time on any kind of hart. T-HEAD is one vendor that implements such
>> strong forward-progress LR/SC instruction pairs, so enable qspinlock
>> for T-HEAD with the help of the errata mechanism.
>>
>> Early versions of T-HEAD processors have the merge buffer delay
>> problem, so we need ERRATA_WRITEONCE to support qspinlock.
>>
>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>> Signed-off-by: Guo Ren <guoren@kernel.org>
>> ---
>>  arch/riscv/Kconfig.errata              | 13 +++++++++++++
>>  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
>>  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
>>  arch/riscv/include/asm/vendorid_list.h |  3 ++-
>>  arch/riscv/kernel/cpufeature.c         |  3 ++-
>>  5 files changed, 61 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
>> index 4745a5c57e7c..eb43677b13cc 100644
>> --- a/arch/riscv/Kconfig.errata
>> +++ b/arch/riscv/Kconfig.errata
>> @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
>>
>>  	  If you don't know what to do here, say "Y".
>>
>> +config ERRATA_THEAD_QSPINLOCK
>> +	bool "Apply T-Head queued spinlock errata"
>> +	depends on ERRATA_THEAD
>> +	default y
>> +	help
>> +	  The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
>> +	  match the xchg_tail requirement of qspinlock.
>> +
>> +	  This will apply the QSPINLOCK errata to handle the non-standard
>> +	  behavior via using qspinlock instead of ticket_lock.
>> +
>> +	  If you don't know what to do here, say "Y".
>
> If this is to be applied, I would like to see a detailed explanation somewhere,
> preferably with citations, of:
>
> (a) The memory model requirements for qspinlock
> (b) Why, with arguments, RISC-V does not architecturally meet (a)
> (c) Why, with arguments, T-HEAD C9xx meets (a)
> (d) Why at least one other architecture which defines ARCH_USE_QUEUED_SPINLOCKS
>     meets (a)

I agree.

Just having a magic fence that makes qspinlocks stop livelocking on some 
processors is going to lead to a mess -- I'd argue this means those 
processors just don't provide the forward progress guarantee, but we'd 
really need something written down about what this new custom 
instruction aliasing as a fence does.

> As far as I can tell, the RISC-V guarantees concerning constrained LR/SC loops
> (livelock freedom but no starvation freedom) are exactly the same as those in
> Armv8 (as of 0487F.c) for equivalent loops, and xchg_tail compiles to a
> constrained LR/SC loop with guaranteed eventual success (with -O1).  Clearly you
> disagree; I would like to see your perspective.

It sounds to me like this processor might be quite broken: if it's 
permanently holding stores in a buffer we're going to have more issues 
than just qspinlock, pretty much anything concurrent is going to have 
issues -- and that's not just in the kernel, there's concurrent 
userspace code as well.
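
For readers without the series at hand, the xchg_tail under discussion
has roughly this shape (a hedged C sketch: the tail mask follows the
generic qspinlock layout, and the weak compare-exchange is what lowers
to an LR.W/SC.W retry loop on RISC-V):

	#include <stdint.h>
	#include <stdbool.h>

	#define TAIL_MASK	0xffff0000u	/* tail: top 16 bits of the lock word */

	static inline uint32_t xchg_tail_sketch(uint32_t *lock, uint32_t tail)
	{
		uint32_t old, new;

		old = __atomic_load_n(lock, __ATOMIC_RELAXED);
		do {
			/* keep the locked/pending bits, swap in the new tail */
			new = (old & ~TAIL_MASK) | tail;
		} while (!__atomic_compare_exchange_n(lock, &old, new, true,
						      __ATOMIC_RELEASE,
						      __ATOMIC_RELAXED));
		return old;	/* caller extracts the previous tail */
	}

Whether that retry loop is guaranteed to eventually succeed on a given
implementation is exactly the forward-progress question raised above.
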

> -s
>
>> +
>>  endmenu # "CPU errata selection"
>> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
>> index 881729746d2e..d560dc45c0e7 100644
>> --- a/arch/riscv/errata/thead/errata.c
>> +++ b/arch/riscv/errata/thead/errata.c
>> @@ -86,6 +86,27 @@ static bool errata_probe_write_once(unsigned int stage,
>>  	return false;
>>  }
>>
>> +static bool errata_probe_qspinlock(unsigned int stage,
>> +				   unsigned long arch_id, unsigned long impid)
>> +{
>> +	if (!IS_ENABLED(CONFIG_ERRATA_THEAD_QSPINLOCK))
>> +		return false;
>> +
>> +	/*
>> +	 * The queued_spinlock torture would get in livelock without
>> +	 * ERRATA_THEAD_WRITE_ONCE fixup for the early versions of T-HEAD
>> +	 * processors.
>> +	 */
>> +	if (arch_id == 0 && impid == 0 &&
>> +	    !IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
>> +		return false;
>> +
>> +	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
>> +		return true;
>> +
>> +	return false;
>> +}
>> +
>>  static u32 thead_errata_probe(unsigned int stage,
>>  			      unsigned long archid, unsigned long impid)
>>  {
>> @@ -103,6 +124,9 @@ static u32 thead_errata_probe(unsigned int stage,
>>  	if (errata_probe_write_once(stage, archid, impid))
>>  		cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
>>
>> +	if (errata_probe_qspinlock(stage, archid, impid))
>> +		cpu_req_errata |= BIT(ERRATA_THEAD_QSPINLOCK);
>> +
>>  	return cpu_req_errata;
>>  }
>>
>> diff --git a/arch/riscv/include/asm/errata_list.h
>> b/arch/riscv/include/asm/errata_list.h
>> index fbb2b8d39321..a696d18d1b0d 100644
>> --- a/arch/riscv/include/asm/errata_list.h
>> +++ b/arch/riscv/include/asm/errata_list.h
>> @@ -141,6 +141,26 @@ asm volatile(ALTERNATIVE(						\
>>  	: "=r" (__ovl) :						\
>>  	: "memory")
>>
>> +static __always_inline bool
>> +riscv_has_errata_thead_qspinlock(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
>> +		asm_volatile_goto(
>> +		ALTERNATIVE(
>> +		"j	%l[l_no]", "nop",
>> +		THEAD_VENDOR_ID,
>> +		ERRATA_THEAD_QSPINLOCK,
>> +		CONFIG_ERRATA_THEAD_QSPINLOCK)
>> +		: : : : l_no);
>> +	} else {
>> +		goto l_no;
>> +	}
>> +
>> +	return true;
>> +l_no:
>> +	return false;
>> +}
>> +
>>  #endif /* __ASSEMBLY__ */
>>
>>  #endif
>> diff --git a/arch/riscv/include/asm/vendorid_list.h
>> b/arch/riscv/include/asm/vendorid_list.h
>> index 73078cfe4029..1f1d03877f5f 100644
>> --- a/arch/riscv/include/asm/vendorid_list.h
>> +++ b/arch/riscv/include/asm/vendorid_list.h
>> @@ -19,7 +19,8 @@
>>  #define	ERRATA_THEAD_CMO 1
>>  #define	ERRATA_THEAD_PMU 2
>>  #define	ERRATA_THEAD_WRITE_ONCE 3
>> -#define	ERRATA_THEAD_NUMBER 4
>> +#define	ERRATA_THEAD_QSPINLOCK 4
>> +#define	ERRATA_THEAD_NUMBER 5
>>  #endif
>>
>>  #endif
>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>> index f8dbbe1bbd34..d9694fe40a9a 100644
>> --- a/arch/riscv/kernel/cpufeature.c
>> +++ b/arch/riscv/kernel/cpufeature.c
>> @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
>>  		 * spinlock value, the only way is to change from queued_spinlock to
>>  		 * ticket_spinlock, but can not be vice.
>>  		 */
>> -		if (!force_qspinlock) {
>> +		if (!force_qspinlock &&
>> +		    !riscv_has_errata_thead_qspinlock()) {
>>  			set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
>>  		}
>>  #endif
>> --
>> 2.36.1
>>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-09-13 18:54       ` Palmer Dabbelt
@ 2023-09-13 19:32         ` Waiman Long
  -1 siblings, 0 replies; 77+ messages in thread
From: Waiman Long @ 2023-09-13 19:32 UTC (permalink / raw)
  To: Palmer Dabbelt, sorear
  Cc: guoren, Paul Walmsley, anup, peterz, mingo, Will Deacon,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, Catalin Marinas,
	Conor Dooley, xiaoguang.xing, Bjorn Topel, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016, linux-arch,
	linux-riscv, linux-doc, kvm, virtualization, linux-csky, guoren

On 9/13/23 14:54, Palmer Dabbelt wrote:
> On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sorear@fastmail.com wrote:
>> On Wed, Aug 2, 2023, at 12:46 PM, guoren@kernel.org wrote:
>>> From: Guo Ren <guoren@linux.alibaba.com>
>>>
>>> Measured against qspinlock's requirements, RISC-V only gives a weak
>>> LR/SC forward progress guarantee, which does not satisfy qspinlock.
>>> But many vendors could implement a stronger LR/SC forward progress
>>> guarantee to ensure that xchg_tail finishes in time on any kind of
>>> hart. T-HEAD is a vendor that implements such strong forward
>>> progress guarantee LR/SC instruction pairs, so enable qspinlock for
>>> T-HEAD with the help of an erratum.
>>>
>>> Early versions of T-HEAD processors have the merge buffer delay
>>> problem, so we need ERRATA_WRITEONCE to support qspinlock.
>>>
>>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>>> Signed-off-by: Guo Ren <guoren@kernel.org>
>>> ---
>>>  arch/riscv/Kconfig.errata              | 13 +++++++++++++
>>>  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
>>>  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
>>>  arch/riscv/include/asm/vendorid_list.h |  3 ++-
>>>  arch/riscv/kernel/cpufeature.c         |  3 ++-
>>>  5 files changed, 61 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
>>> index 4745a5c57e7c..eb43677b13cc 100644
>>> --- a/arch/riscv/Kconfig.errata
>>> +++ b/arch/riscv/Kconfig.errata
>>> @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
>>>
>>>        If you don't know what to do here, say "Y".
>>>
>>> +config ERRATA_THEAD_QSPINLOCK
>>> +    bool "Apply T-Head queued spinlock errata"
>>> +    depends on ERRATA_THEAD
>>> +    default y
>>> +    help
>>> +      The T-HEAD C9xx processors implement strong fwd guarantee 
>>> LR/SC to
>>> +      match the xchg_tail requirement of qspinlock.
>>> +
>>> +      This will apply the QSPINLOCK errata to handle the non-standard
>>> +      behavior via using qspinlock instead of ticket_lock.
>>> +
>>> +      If you don't know what to do here, say "Y".
>>
>> If this is to be applied, I would like to see a detailed explanation 
>> somewhere,
>> preferably with citations, of:
>>
>> (a) The memory model requirements for qspinlock

The part of qspinlock that causes problems on many RISC architectures 
is its use of a 16-bit xchg() call, which many RISC architectures 
cannot do natively and have to emulate with, hopefully, some forward 
progress guarantee. Except for that one call, the other atomic 
operations are all 32 bits in size.
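
As an illustration -- a sketch with made-up names and C11 atomics,
not the kernel's code -- the emulation widens the 16-bit exchange into
a full-word compare-and-swap loop, at which point forward progress
depends on the CAS eventually succeeding:

#include <stdatomic.h>
#include <stdint.h>

/* Exchange the upper 16-bit tail field of a 32-bit lock word. */
static inline uint32_t xchg_tail16(_Atomic uint32_t *lock, uint32_t tail)
{
	uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);
	uint32_t new;

	do {
		/* keep the low half (locked + pending), swap the tail */
		new = (old & 0x0000ffffu) | (tail << 16);
	} while (!atomic_compare_exchange_weak_explicit(lock, &old, new,
						memory_order_acq_rel,
						memory_order_relaxed));
	return old >> 16;	/* previous tail */
}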

Cheers,
Longman


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK
  2023-09-13 18:54       ` Palmer Dabbelt
@ 2023-09-14  3:31         ` Guo Ren
  -1 siblings, 0 replies; 77+ messages in thread
From: Guo Ren @ 2023-09-14  3:31 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: sorear, Paul Walmsley, anup, peterz, mingo, Will Deacon, longman,
	boqun.feng, tglx, paulmck, rostedt, rdunlap, Catalin Marinas,
	Conor Dooley, xiaoguang.xing, Bjorn Topel, alexghiti, keescook,
	greentime.hu, ajones, jszhang, wefu, wuwei2016, linux-arch,
	linux-riscv, linux-doc, kvm, virtualization, linux-csky, guoren

On Thu, Sep 14, 2023 at 2:54 AM Palmer Dabbelt <palmer@rivosinc.com> wrote:
>
> On Sun, 06 Aug 2023 22:23:34 PDT (-0700), sorear@fastmail.com wrote:
> > On Wed, Aug 2, 2023, at 12:46 PM, guoren@kernel.org wrote:
> >> From: Guo Ren <guoren@linux.alibaba.com>
> >>
> >> Measured against qspinlock's requirements, RISC-V only gives a weak
> >> LR/SC forward progress guarantee, which does not satisfy qspinlock.
> >> But many vendors could implement a stronger LR/SC forward progress
> >> guarantee to ensure that xchg_tail finishes in time on any kind of
> >> hart. T-HEAD is a vendor that implements such strong forward
> >> progress guarantee LR/SC instruction pairs, so enable qspinlock for
> >> T-HEAD with the help of an erratum.
> >>
> >> Early versions of T-HEAD processors have the merge buffer delay
> >> problem, so we need ERRATA_WRITEONCE to support qspinlock.
> >>
> >> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> >> Signed-off-by: Guo Ren <guoren@kernel.org>
> >> ---
> >>  arch/riscv/Kconfig.errata              | 13 +++++++++++++
> >>  arch/riscv/errata/thead/errata.c       | 24 ++++++++++++++++++++++++
> >>  arch/riscv/include/asm/errata_list.h   | 20 ++++++++++++++++++++
> >>  arch/riscv/include/asm/vendorid_list.h |  3 ++-
> >>  arch/riscv/kernel/cpufeature.c         |  3 ++-
> >>  5 files changed, 61 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
> >> index 4745a5c57e7c..eb43677b13cc 100644
> >> --- a/arch/riscv/Kconfig.errata
> >> +++ b/arch/riscv/Kconfig.errata
> >> @@ -96,4 +96,17 @@ config ERRATA_THEAD_WRITE_ONCE
> >>
> >>        If you don't know what to do here, say "Y".
> >>
> >> +config ERRATA_THEAD_QSPINLOCK
> >> +    bool "Apply T-Head queued spinlock errata"
> >> +    depends on ERRATA_THEAD
> >> +    default y
> >> +    help
> >> +      The T-HEAD C9xx processors implement strong fwd guarantee LR/SC to
> >> +      match the xchg_tail requirement of qspinlock.
> >> +
> >> +      This will apply the QSPINLOCK errata to handle the non-standard
> >> +      behavior via using qspinlock instead of ticket_lock.
> >> +
> >> +      If you don't know what to do here, say "Y".
> >
> > If this is to be applied, I would like to see a detailed explanation somewhere,
> > preferably with citations, of:
> >
> > (a) The memory model requirements for qspinlock
> > (b) Why, with arguments, RISC-V does not architecturally meet (a)
> > (c) Why, with arguments, T-HEAD C9xx meets (a)
> > (d) Why at least one other architecture which defines ARCH_USE_QUEUED_SPINLOCKS
> >     meets (a)
>
> I agree.
>
> Just having a magic fence that makes qspinlocks stop livelocking on some
> processors is going to lead to a mess -- I'd argue this means those
> processors just don't provide the forward progress guarantee, but we'd
> really need something written down about what this new custom
> instruction aliasing as a fence does.
The "magic fence" is not related to the LR/SC forward progress
guarantee, and it's our processors' store buffer hardware problem that
needs to be fixed. Not only is qspinlock suffering on it, but also
kernel/locking/osq_lock.c.

This is about this patch:
https://lore.kernel.org/linux-riscv/20230910082911.3378782-10-guoren@kernel.org/
I've written down the root cause in that patch: the "new custom fence
instruction" triggers a store buffer flush.

Only normal store instructions have the problem; atomic stores
(including LR/SC) are fine.
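
The shape of that fixup is roughly the sketch below; "fence w, w" is
only a placeholder for the vendor instruction, whose real encoding is
in the patch linked above:

#define WRITE_ONCE_FLUSH(x, val)					\
do {									\
	*(volatile typeof(x) *)&(x) = (val);	/* the plain store */	\
	asm volatile(ALTERNATIVE(	/* drain the store buffer */	\
		"nop", "fence w, w",	/* placeholder mnemonic */	\
		THEAD_VENDOR_ID,					\
		ERRATA_THEAD_WRITE_ONCE,				\
		CONFIG_ERRATA_THEAD_WRITE_ONCE)				\
		: : : "memory");					\
} while (0)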

>
> > As far as I can tell, the RISC-V guarantees concerning constrained LR/SC loops
> > (livelock freedom but no starvation freedom) are exactly the same as those in
> > Armv8 (as of 0487F.c) for equivalent loops, and xchg_tail compiles to a
> > constrained LR/SC loop with guaranteed eventual success (with -O1).  Clearly you
> > disagree; I would like to see your perspective.
>
> It sounds to me like this processor might be quite broken: if it's
> permanently holding stores in a buffer we're going to have more issues
> than just qspinlock, pretty much anything concurrent is going to have
> issues -- and that's not just in the kernel, there's concurrent
> userspace code as well.
Yes, the userspace scenarios are what worry us, because modifying the
userspace atomic libraries is impossible. We are now stress-testing
various parallel userspace applications, especially a userspace queued
spinlock, on the 128-core hardware platform.
We haven't detected the problem in userspace yet; that could be because
timer interrupts break the store buffer wait.

>
> > -s
> >
> >> +
> >>  endmenu # "CPU errata selection"
> >> diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
> >> index 881729746d2e..d560dc45c0e7 100644
> >> --- a/arch/riscv/errata/thead/errata.c
> >> +++ b/arch/riscv/errata/thead/errata.c
> >> @@ -86,6 +86,27 @@ static bool errata_probe_write_once(unsigned int stage,
> >>      return false;
> >>  }
> >>
> >> +static bool errata_probe_qspinlock(unsigned int stage,
> >> +                               unsigned long arch_id, unsigned long impid)
> >> +{
> >> +    if (!IS_ENABLED(CONFIG_ERRATA_THEAD_QSPINLOCK))
> >> +            return false;
> >> +
> >> +    /*
> >> +     * The queued_spinlock torture would get in livelock without
> >> +     * ERRATA_THEAD_WRITE_ONCE fixup for the early versions of T-HEAD
> >> +     * processors.
> >> +     */
> >> +    if (arch_id == 0 && impid == 0 &&
> >> +        !IS_ENABLED(CONFIG_ERRATA_THEAD_WRITE_ONCE))
> >> +            return false;
> >> +
> >> +    if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> >> +            return true;
> >> +
> >> +    return false;
> >> +}
> >> +
> >>  static u32 thead_errata_probe(unsigned int stage,
> >>                            unsigned long archid, unsigned long impid)
> >>  {
> >> @@ -103,6 +124,9 @@ static u32 thead_errata_probe(unsigned int stage,
> >>      if (errata_probe_write_once(stage, archid, impid))
> >>              cpu_req_errata |= BIT(ERRATA_THEAD_WRITE_ONCE);
> >>
> >> +    if (errata_probe_qspinlock(stage, archid, impid))
> >> +            cpu_req_errata |= BIT(ERRATA_THEAD_QSPINLOCK);
> >> +
> >>      return cpu_req_errata;
> >>  }
> >>
> >> diff --git a/arch/riscv/include/asm/errata_list.h
> >> b/arch/riscv/include/asm/errata_list.h
> >> index fbb2b8d39321..a696d18d1b0d 100644
> >> --- a/arch/riscv/include/asm/errata_list.h
> >> +++ b/arch/riscv/include/asm/errata_list.h
> >> @@ -141,6 +141,26 @@ asm volatile(ALTERNATIVE(                                               \
> >>      : "=r" (__ovl) :                                                \
> >>      : "memory")
> >>
> >> +static __always_inline bool
> >> +riscv_has_errata_thead_qspinlock(void)
> >> +{
> >> +    if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
> >> +            asm_volatile_goto(
> >> +            ALTERNATIVE(
> >> +            "j      %l[l_no]", "nop",
> >> +            THEAD_VENDOR_ID,
> >> +            ERRATA_THEAD_QSPINLOCK,
> >> +            CONFIG_ERRATA_THEAD_QSPINLOCK)
> >> +            : : : : l_no);
> >> +    } else {
> >> +            goto l_no;
> >> +    }
> >> +
> >> +    return true;
> >> +l_no:
> >> +    return false;
> >> +}
> >> +
> >>  #endif /* __ASSEMBLY__ */
> >>
> >>  #endif
> >> diff --git a/arch/riscv/include/asm/vendorid_list.h
> >> b/arch/riscv/include/asm/vendorid_list.h
> >> index 73078cfe4029..1f1d03877f5f 100644
> >> --- a/arch/riscv/include/asm/vendorid_list.h
> >> +++ b/arch/riscv/include/asm/vendorid_list.h
> >> @@ -19,7 +19,8 @@
> >>  #define     ERRATA_THEAD_CMO 1
> >>  #define     ERRATA_THEAD_PMU 2
> >>  #define     ERRATA_THEAD_WRITE_ONCE 3
> >> -#define     ERRATA_THEAD_NUMBER 4
> >> +#define     ERRATA_THEAD_QSPINLOCK 4
> >> +#define     ERRATA_THEAD_NUMBER 5
> >>  #endif
> >>
> >>  #endif
> >> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> >> index f8dbbe1bbd34..d9694fe40a9a 100644
> >> --- a/arch/riscv/kernel/cpufeature.c
> >> +++ b/arch/riscv/kernel/cpufeature.c
> >> @@ -342,7 +342,8 @@ void __init riscv_fill_hwcap(void)
> >>               * spinlock value, the only way is to change from queued_spinlock to
> >>               * ticket_spinlock, but can not be vice.
> >>               */
> >> -            if (!force_qspinlock) {
> >> +            if (!force_qspinlock &&
> >> +                !riscv_has_errata_thead_qspinlock()) {
> >>                      set_bit(RISCV_ISA_EXT_XTICKETLOCK, isainfo->isa);
> >>              }
> >>  #endif
> >> --
> >> 2.36.1
> >>
> >>
> >> _______________________________________________
> >> linux-riscv mailing list
> >> linux-riscv@lists.infradead.org
> >> http://lists.infradead.org/mailman/listinfo/linux-riscv



-- 
Best Regards
 Guo Ren

^ permalink raw reply	[flat|nested] 77+ messages in thread

end of thread, other threads:[~2023-09-14  3:32 UTC | newest]

Thread overview: 77+ messages
2023-08-02 16:46 [PATCH V10 00/19] riscv: Add Native/Paravirt/CNA qspinlock support guoren
2023-08-02 16:46 ` guoren
2023-08-02 16:46 ` [PATCH V10 01/19] asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 02/19] asm-generic: ticket-lock: Move into ticket_spinlock.h guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 03/19] riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 04/19] riscv: qspinlock: Add basic queued_spinlock support guoren
2023-08-02 16:46   ` guoren
2023-08-11 19:34   ` Waiman Long
2023-08-11 19:34     ` Waiman Long
2023-08-11 19:34     ` Waiman Long
2023-08-12  0:18     ` Guo Ren
2023-08-12  0:18       ` Guo Ren
2023-08-02 16:46 ` [PATCH V10 05/19] riscv: qspinlock: Introduce combo spinlock guoren
2023-08-02 16:46   ` guoren
2023-08-11 19:51   ` Waiman Long
2023-08-11 19:51     ` Waiman Long
2023-08-11 19:51     ` Waiman Long
2023-08-12  0:22     ` Guo Ren
2023-08-12  0:22       ` Guo Ren
2023-08-02 16:46 ` [PATCH V10 06/19] riscv: qspinlock: Allow force qspinlock from the command line guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 07/19] riscv: qspinlock: errata: Introduce ERRATA_THEAD_QSPINLOCK guoren
2023-08-02 16:46   ` guoren
2023-08-04  9:05   ` Conor Dooley
2023-08-04  9:05     ` Conor Dooley
2023-08-04  9:53     ` Guo Ren
2023-08-04  9:53       ` Guo Ren
2023-08-04 10:06       ` Conor Dooley
2023-08-04 10:06         ` Conor Dooley
2023-08-05  1:28         ` Guo Ren
2023-08-05  1:28           ` Guo Ren
2023-08-07  5:23   ` Stefan O'Rear
2023-08-07  5:23     ` Stefan O'Rear
2023-08-08  2:12     ` Guo Ren
2023-08-08  2:12       ` Guo Ren
2023-09-13 18:54     ` Palmer Dabbelt
2023-09-13 18:54       ` Palmer Dabbelt
2023-09-13 19:32       ` Waiman Long
2023-09-13 19:32         ` Waiman Long
2023-09-13 19:32         ` Waiman Long
2023-09-14  3:31       ` Guo Ren
2023-09-14  3:31         ` Guo Ren
2023-08-02 16:46 ` [PATCH V10 08/19] riscv: qspinlock: Use new static key for controlling call of virt_spin_lock() guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 09/19] RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 10/19] RISC-V: paravirt: pvqspinlock: KVM: " guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 11/19] RISC-V: paravirt: pvqspinlock: KVM: Implement kvm_sbi_ext_pvlock_kick_cpu() guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 12/19] RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 13/19] RISC-V: paravirt: pvqspinlock: Remove unnecessary definitions of cmpxchg & xchg guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 14/19] RISC-V: paravirt: pvqspinlock: Add xchg8 & cmpxchg_small support guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 15/19] RISC-V: paravirt: pvqspinlock: Add SBI implementation guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 16/19] RISC-V: paravirt: pvqspinlock: Add kconfig entry guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:46 ` [PATCH V10 17/19] RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait guoren
2023-08-02 16:46   ` guoren
2023-08-02 16:47 ` [PATCH V10 18/19] locking/qspinlock: Move pv_ops into x86 directory guoren
2023-08-02 16:47   ` guoren
2023-08-11 20:42   ` Waiman Long
2023-08-11 20:42     ` Waiman Long
2023-08-11 20:42     ` Waiman Long
2023-08-12  0:24     ` Guo Ren
2023-08-12  0:24       ` Guo Ren
2023-08-12  0:47       ` Waiman Long
2023-08-12  0:47         ` Waiman Long
2023-08-12  0:47         ` Waiman Long
2023-08-02 16:47 ` [PATCH V10 19/19] locking/qspinlock: riscv: Add Compact NUMA-aware lock support guoren
2023-08-02 16:47   ` guoren
