* [PATCH 0/3] RISC-V: use generic spinlock and rwlock
@ 2019-02-11  4:38 Michael Clark
  2019-02-11  4:38 ` [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short Michael Clark
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Michael Clark @ 2019-02-11  4:38 UTC (permalink / raw)
  To: Linux RISC-V; +Cc: RISC-V Patches, Linux MIPS, Michael Clark

Use fair spinlocks and rwlocks on RISC-V.

I investigated the use of ticket spinlocks for RISC-V so that we have
fair spinlocks under contention. After making the generic changes, I
found that queued spinlocks require atomic operations on small words
(RISC-V only supports LR/SC on 32-bit or 64-bit words); so this series
borrows support for small word atomics from the MIPS port, updates the
RISC-V port to use the generic spinlocks and rwlocks, and finally fixes
a bug found during visual inspection of the MIPS small word atomics
support.

The queued spinlocks and rwlocks are in asm-generic, so this series
reduces platform-specific code in the RISC-V port, apart from adding
the small word atomics support, which expands generic support for
atomics and is presumably useful elsewhere.

The patch series has been tested successfully with SMP in RISC-V QEMU
using the riscv-linux-4.20 branch: https://github.com/riscv/riscv-linux
and applies cleanly to torvalds/master.

Note: acquire or release semantics are passed through to the underlying
cmpxchg implementation for the minimum atomic operation word size (32
bits). The aligned larger word load, used to fetch and mask the previous
value of the word surrounding the small word, is performed relaxed,
before the larger word atomic cmpxchg operation. One assumes the MIPS
code has been battle tested; however, the RISC-V Linux memory model has
additional ordering constraints for acquire/release.

_relaxed_: the aligned large word load is relaxed, so this is okay.

_acquire_: the aligned large word load is covered by the "fence r,rw"
acquire barrier _following_ the compare and swap operation, thus it is
correctly placed before the acquire barrier; locally it is a syntactic
dependency of the compare and swap operation, and is thus correctly
ordered.

_release_: the aligned large word load occurs before the "fence rw,w"
_preceding_ the compare and swap, thus it is technically a load before
a write barrier, and the fence implies additional ordering of the load
before the compare and swap. This adds additional ordering for the
first loop iteration. It is a load that the compare and swap depends
on, and thus it does not require any additional ordering. In this case,
ordering could be relaxed by performing the aligned large word load
after the barrier preceding the compare and swap; however, this would
require a special variant of the cmpxchg asm. The operation is not
invalid; rather, the release fence adds additional explicit ordering
for the aligned large word load that is technically not required. This
may show up as an additional LR/SC loop iteration under contention due
to non-optimal fence placement.
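
In code form, the release case is roughly the following (a sketch, not
the exact macro expansion; the comment marks where the current
cmpxchg_release asm places the fence):

	load32 = *ptr32;	/* aligned large word load, relaxed */
	do {
		old32 = load32;
		new32 = (load32 & ~mask) | (new << shift);
		/*
		 * expands to: fence rw,w; 0: lr.w; bne; sc.w; bnez 0b
		 * -- the fence also orders the load above, which is
		 * stronger than strictly required
		 */
		load32 = cmpxchg_release(ptr32, old32, new32);
	} while (load32 != old32);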

QEMU on x86 is not representative of real hardware and is likely more
tolerant than weakly ordered hardware. Further testing is advised,
ideally on real hardware or an aggressive OoO simulator that has been
verified against the RISC-V Memory Model.

Michael Clark (3):
  RISC-V: implement xchg_small and cmpxchg_small for char and short
  RISC-V: convert custom spinlock/rwlock to generic qspinlock/qrwlock
  MIPS: fix truncation in __cmpxchg_small for short values

 arch/mips/kernel/cmpxchg.c              |   2 +-
 arch/riscv/Kconfig                      |   2 +
 arch/riscv/include/asm/cmpxchg.h        |  62 +++++++++
 arch/riscv/include/asm/mcs_spinlock.h   |   7 ++
 arch/riscv/include/asm/qrwlock.h        |   8 ++
 arch/riscv/include/asm/qspinlock.h      |   8 ++
 arch/riscv/include/asm/spinlock.h       | 141 +-----------------------
 arch/riscv/include/asm/spinlock_types.h |  33 +-----
 arch/riscv/kernel/Makefile              |   1 +
 arch/riscv/kernel/cmpxchg.c             | 118 ++++++++++++++++++++
 10 files changed, 214 insertions(+), 168 deletions(-)
 create mode 100644 arch/riscv/include/asm/mcs_spinlock.h
 create mode 100644 arch/riscv/include/asm/qrwlock.h
 create mode 100644 arch/riscv/include/asm/qspinlock.h
 create mode 100644 arch/riscv/kernel/cmpxchg.c

-- 
2.17.1



* [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short
  2019-02-11  4:38 [PATCH 0/3] RISC-V: use generic spinlock and rwlock Michael Clark
@ 2019-02-11  4:38 ` Michael Clark
  2019-02-12  7:17   ` Christoph Hellwig
  2019-02-11  4:38 ` [PATCH 2/3] RISC-V: convert custom spinlock/rwlock to generic qspinlock/qrwlock Michael Clark
  2019-02-11  4:38 ` [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values Michael Clark
  2 siblings, 1 reply; 11+ messages in thread
From: Michael Clark @ 2019-02-11  4:38 UTC (permalink / raw)
  To: Linux RISC-V; +Cc: RISC-V Patches, Michael Clark

This patch implements xchg and cmpxchg for char and short. xchg
and cmpxchg on small words are necessary to use the generic
qspinlock and qrwlock, which are enabled in a subsequent patch.

The MIPS cmpxchg code is adapted into a macro template to implement
the three additional variants (relaxed|acquire|release) supported
by the RISC-V memory model.
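
For example, the generic qspinlock exchanges a 16-bit tail field, and
the generic code also performs byte-sized operations; a sketch
(illustrative types and field names, not the exact generic code) of
what this patch makes possible on RISC-V:

	u16 old_tail = xchg_relaxed(&lock->tail, new_tail);	/* size 2 */
	u8 old_byte = cmpxchg(&flag_byte, 0, 1);		/* size 1 */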

Cc: RISC-V Patches <patches@groups.riscv.org>
Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
Signed-off-by: Michael Clark <michaeljclark@mac.com>
---
 arch/riscv/include/asm/cmpxchg.h |  62 ++++++++++++++
 arch/riscv/kernel/Makefile       |   1 +
 arch/riscv/kernel/cmpxchg.c      | 118 +++++++++++++++++++++++++++++++
 3 files changed, 181 insertions(+)
 create mode 100644 arch/riscv/kernel/cmpxchg.c

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index c12833f7b6bd..64b3d36e2d6e 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -19,12 +19,35 @@
 #include <asm/barrier.h>
 #include <asm/fence.h>
 
+extern unsigned long __xchg_relaxed_small(volatile void *ptr, unsigned long new,
+					  unsigned int size);
+extern unsigned long __xchg_acquire_small(volatile void *ptr, unsigned long new,
+					  unsigned int size);
+extern unsigned long __xchg_release_small(volatile void *ptr, unsigned long new,
+					  unsigned int size);
+extern unsigned long __xchg_small(volatile void *ptr, unsigned long new,
+				  unsigned int size);
+
+extern unsigned long __cmpxchg_relaxed_small(volatile void *ptr, unsigned long old,
+					     unsigned long new, unsigned int size);
+extern unsigned long __cmpxchg_acquire_small(volatile void *ptr, unsigned long old,
+					     unsigned long new, unsigned int size);
+extern unsigned long __cmpxchg_release_small(volatile void *ptr, unsigned long old,
+					     unsigned long new, unsigned int size);
+extern unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
+				     unsigned long new, unsigned int size);
+
 #define __xchg_relaxed(ptr, new, size)					\
 ({									\
 	__typeof__(ptr) __ptr = (ptr);					\
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__xchg_relaxed_small(	\
+			(void*)ptr, (unsigned long)__new, size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w %0, %2, %1\n"			\
@@ -58,6 +81,11 @@
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__xchg_acquire_small(	\
+			(void*)ptr, (unsigned long)__new, size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w %0, %2, %1\n"			\
@@ -93,6 +121,11 @@
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__xchg_release_small(	\
+			(void*)ptr, (unsigned long)__new, size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			RISCV_RELEASE_BARRIER				\
@@ -128,6 +161,11 @@
 	__typeof__(new) __new = (new);					\
 	__typeof__(*(ptr)) __ret;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__xchg_small(		\
+			(void*)ptr, (unsigned long)__new, size);	\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"	amoswap.w.aqrl %0, %2, %1\n"		\
@@ -179,6 +217,12 @@
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__cmpxchg_relaxed_small(	\
+			(void*)__ptr, (unsigned long)__old,		\
+			(unsigned long)__new, size);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -223,6 +267,12 @@
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__cmpxchg_acquire_small(	\
+			(void*)__ptr, (unsigned long)__old,		\
+			(unsigned long)__new, size);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
@@ -269,6 +319,12 @@
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__cmpxchg_release_small(	\
+			(void*)__ptr, (unsigned long)__old,		\
+			(unsigned long)__new, size);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			RISCV_RELEASE_BARRIER				\
@@ -315,6 +371,12 @@
 	__typeof__(*(ptr)) __ret;					\
 	register unsigned int __rc;					\
 	switch (size) {							\
+	case 1:								\
+	case 2:								\
+		__ret = (__typeof__(*(ptr)))__cmpxchg_small(		\
+			(void*)__ptr, (unsigned long)__old,		\
+			(unsigned long)__new, size);			\
+		break;							\
 	case 4:								\
 		__asm__ __volatile__ (					\
 			"0:	lr.w %0, %2\n"				\
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index f13f7f276639..9f96ba34fd85 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -27,6 +27,7 @@ obj-y	+= riscv_ksyms.o
 obj-y	+= stacktrace.o
 obj-y	+= vdso.o
 obj-y	+= cacheinfo.o
+obj-y	+= cmpxchg.o
 obj-y	+= vdso/
 
 CFLAGS_setup.o := -mcmodel=medany
diff --git a/arch/riscv/kernel/cmpxchg.c b/arch/riscv/kernel/cmpxchg.c
new file mode 100644
index 000000000000..6208d81e4461
--- /dev/null
+++ b/arch/riscv/kernel/cmpxchg.c
@@ -0,0 +1,118 @@
+/*
+ * Copyright (C) 2017 Imagination Technologies
+ * Author: Paul Burton <paul.burton@mips.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/bitops.h>
+#include <asm/cmpxchg.h>
+
+#define TEMPLATE_XCHG_SMALL(__func,__op)				\
+unsigned long __func(volatile void *ptr, unsigned long new,		\
+		     unsigned int size)					\
+{									\
+	u32 old32, new32, load32, mask;					\
+	volatile u32 *ptr32;						\
+	unsigned int shift;						\
+									\
+	/* Check that ptr is naturally aligned */			\
+	WARN_ON((unsigned long)ptr & (size - 1));			\
+									\
+	/* Mask value to the correct size. */				\
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);			\
+	new &= mask;							\
+									\
+	/*								\
+	 * Calculate a shift & mask that corresponds to the value	\
+	 * we wish to exchange within the naturally aligned 4 byte 	\
+	 * integer that includes it.					\
+	 */								\
+	shift = (unsigned long)ptr & 0x3;				\
+	shift *= BITS_PER_BYTE;						\
+	mask <<= shift;							\
+									\
+	/*								\
+	 * Calculate a pointer to the naturally aligned 4 byte 		\
+	 * integer that includes our byte, and load its value.		\
+	 */								\
+	ptr32 = (volatile u32 *)((unsigned long)ptr & ~0x3);		\
+	load32 = *ptr32;						\
+									\
+	do {								\
+		old32 = load32;						\
+		new32 = (load32 & ~mask) | (new << shift);		\
+		load32 = __op(ptr32, old32, new32);			\
+	} while (load32 != old32);					\
+									\
+	return (load32 & mask) >> shift;				\
+}
+
+TEMPLATE_XCHG_SMALL(__xchg_small,cmpxchg)
+TEMPLATE_XCHG_SMALL(__xchg_relaxed_small,cmpxchg_relaxed)
+TEMPLATE_XCHG_SMALL(__xchg_acquire_small,cmpxchg_acquire)
+TEMPLATE_XCHG_SMALL(__xchg_release_small,cmpxchg_release)
+
+#define TEMPLATE_CMPXCHG_SMALL(__func,__op)				\
+unsigned long __func(volatile void *ptr, unsigned long old,		\
+		     unsigned long new, unsigned int size)		\
+{									\
+	u32 old32, new32, load32, mask;					\
+	volatile u32 *ptr32;						\
+	unsigned int shift;						\
+	u32 load;							\
+									\
+	/* Check that ptr is naturally aligned */			\
+	WARN_ON((unsigned long)ptr & (size - 1));			\
+									\
+	/* Mask inputs to the correct size. */				\
+	mask = GENMASK((size * BITS_PER_BYTE) - 1, 0);			\
+	old &= mask;							\
+	new &= mask;							\
+									\
+	/*								\
+	 * Calculate a shift & mask that corresponds to the value	\
+	 * we wish to exchange within the naturally aligned 4 byte 	\
+	 * integer that includes it.					\
+	 */								\
+	shift = (unsigned long)ptr & 0x3;				\
+	shift *= BITS_PER_BYTE;						\
+	mask <<= shift;							\
+									\
+	/*								\
+	 * Calculate a pointer to the naturally aligned 4 byte 		\
+	 * integer that includes our byte, and load its value.		\
+	 */								\
+	ptr32 = (volatile u32 *)((unsigned long)ptr & ~0x3);		\
+	load32 = *ptr32;						\
+									\
+	while (true) {							\
+		/*							\
+		 * Ensure the subword we want to exchange matches 	\
+		 * the expected old value, and if not then bail.	\
+		 */							\
+		load = (load32 & mask) >> shift;			\
+		if (load != old)					\
+			return load;					\
+									\
+		/*							\
+		 * Calculate the old & new values of the naturally	\
+		 * aligned 4 byte integer including the byte we want	\
+		 * to exchange. Attempt to exchange the old value	\
+		 * for the new value, and return if we succeed.		\
+		 */							\
+		old32 = (load32 & ~mask) | (old << shift);		\
+		new32 = (load32 & ~mask) | (new << shift);		\
+		load32 = __op(ptr32, old32, new32);			\
+		if (load32 == old32)					\
+			return old;					\
+	}								\
+}
+
+TEMPLATE_CMPXCHG_SMALL(__cmpxchg_small,cmpxchg)
+TEMPLATE_CMPXCHG_SMALL(__cmpxchg_relaxed_small,cmpxchg_relaxed)
+TEMPLATE_CMPXCHG_SMALL(__cmpxchg_acquire_small,cmpxchg_acquire)
+TEMPLATE_CMPXCHG_SMALL(__cmpxchg_release_small,cmpxchg_release)
-- 
2.17.1



* [PATCH 2/3] RISC-V: convert custom spinlock/rwlock to generic qspinlock/qrwlock
  2019-02-11  4:38 [PATCH 0/3] RISC-V: use generic spinlock and rwlock Michael Clark
  2019-02-11  4:38 ` [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short Michael Clark
@ 2019-02-11  4:38 ` Michael Clark
  2019-02-11  4:38 ` [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values Michael Clark
  2 siblings, 0 replies; 11+ messages in thread
From: Michael Clark @ 2019-02-11  4:38 UTC (permalink / raw)
  To: Linux RISC-V; +Cc: RISC-V Patches, Michael Clark

Update the RISC-V port to use the generic qspinlock and qrwlock.

This patch requires support for xchg and cmpxchg for small words
(char and short) which are added by a previous patch.
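
For context, the generic qspinlock packs all of its state into a single
32-bit word; the layout below is reproduced from the asm-generic
comments (the NR_CPUS < 16K case) and shows why 8-bit and 16-bit
xchg/cmpxchg are needed:

	/*
	 * bits  0- 7: locked byte
	 * bit      8: pending
	 * bits  9-15: not used
	 * bits 16-17: tail index
	 * bits 18-31: tail cpu (+1)
	 */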

Cc: RISC-V Patches <patches@groups.riscv.org>
Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
Signed-off-by: Michael Clark <michaeljclark@mac.com>
---
 arch/riscv/Kconfig                      |   2 +
 arch/riscv/include/asm/mcs_spinlock.h   |   7 ++
 arch/riscv/include/asm/qrwlock.h        |   8 ++
 arch/riscv/include/asm/qspinlock.h      |   8 ++
 arch/riscv/include/asm/spinlock.h       | 141 +-----------------------
 arch/riscv/include/asm/spinlock_types.h |  33 +-----
 6 files changed, 32 insertions(+), 167 deletions(-)
 create mode 100644 arch/riscv/include/asm/mcs_spinlock.h
 create mode 100644 arch/riscv/include/asm/qrwlock.h
 create mode 100644 arch/riscv/include/asm/qspinlock.h

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 55da93f4e818..ac4c9f889c61 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -43,6 +43,8 @@ config RISCV
 	select RISCV_TIMER
 	select GENERIC_IRQ_MULTI_HANDLER
 	select ARCH_HAS_PTE_SPECIAL
+	select ARCH_USE_QUEUED_RWLOCKS
+	select ARCH_USE_QUEUED_SPINLOCKS
 
 config MMU
 	def_bool y
diff --git a/arch/riscv/include/asm/mcs_spinlock.h b/arch/riscv/include/asm/mcs_spinlock.h
new file mode 100644
index 000000000000..124dfe0a53d2
--- /dev/null
+++ b/arch/riscv/include/asm/mcs_spinlock.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_MCS_SPINLOCK_H
+#define _ASM_RISCV_MCS_SPINLOCK_H
+
+#include <asm-generic/mcs_spinlock.h>
+
+#endif /* _ASM_RISCV_MCS_SPINLOCK_H */
diff --git a/arch/riscv/include/asm/qrwlock.h b/arch/riscv/include/asm/qrwlock.h
new file mode 100644
index 000000000000..5f8a1478f207
--- /dev/null
+++ b/arch/riscv/include/asm/qrwlock.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_QRWLOCK_H
+#define _ASM_RISCV_QRWLOCK_H
+
+#include <asm-generic/qrwlock_types.h>
+#include <asm-generic/qrwlock.h>
+
+#endif /* _ASM_RISCV_QRWLOCK_H */
diff --git a/arch/riscv/include/asm/qspinlock.h b/arch/riscv/include/asm/qspinlock.h
new file mode 100644
index 000000000000..0c2c1ee22b77
--- /dev/null
+++ b/arch/riscv/include/asm/qspinlock.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_QSPINLOCK_H
+#define _ASM_RISCV_QSPINLOCK_H
+
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qspinlock.h>
+
+#endif /* _ASM_RISCV_QSPINLOCK_H */
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index 8eb26d1ede81..fc405eeb8163 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -1,143 +1,8 @@
-/*
- * Copyright (C) 2015 Regents of the University of California
- * Copyright (C) 2017 SiFive
- *
- *   This program is free software; you can redistribute it and/or
- *   modify it under the terms of the GNU General Public License
- *   as published by the Free Software Foundation, version 2.
- *
- *   This program is distributed in the hope that it will be useful,
- *   but WITHOUT ANY WARRANTY; without even the implied warranty of
- *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *   GNU General Public License for more details.
- */
-
+/* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _ASM_RISCV_SPINLOCK_H
 #define _ASM_RISCV_SPINLOCK_H
 
-#include <linux/kernel.h>
-#include <asm/current.h>
-#include <asm/fence.h>
-
-/*
- * Simple spin lock operations.  These provide no fairness guarantees.
- */
-
-/* FIXME: Replace this with a ticket lock, like MIPS. */
-
-#define arch_spin_is_locked(x)	(READ_ONCE((x)->lock) != 0)
-
-static inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
-	smp_store_release(&lock->lock, 0);
-}
-
-static inline int arch_spin_trylock(arch_spinlock_t *lock)
-{
-	int tmp = 1, busy;
-
-	__asm__ __volatile__ (
-		"	amoswap.w %0, %2, %1\n"
-		RISCV_ACQUIRE_BARRIER
-		: "=r" (busy), "+A" (lock->lock)
-		: "r" (tmp)
-		: "memory");
-
-	return !busy;
-}
-
-static inline void arch_spin_lock(arch_spinlock_t *lock)
-{
-	while (1) {
-		if (arch_spin_is_locked(lock))
-			continue;
-
-		if (arch_spin_trylock(lock))
-			break;
-	}
-}
-
-/***********************************************************/
-
-static inline void arch_read_lock(arch_rwlock_t *lock)
-{
-	int tmp;
-
-	__asm__ __volatile__(
-		"1:	lr.w	%1, %0\n"
-		"	bltz	%1, 1b\n"
-		"	addi	%1, %1, 1\n"
-		"	sc.w	%1, %1, %0\n"
-		"	bnez	%1, 1b\n"
-		RISCV_ACQUIRE_BARRIER
-		: "+A" (lock->lock), "=&r" (tmp)
-		:: "memory");
-}
-
-static inline void arch_write_lock(arch_rwlock_t *lock)
-{
-	int tmp;
-
-	__asm__ __volatile__(
-		"1:	lr.w	%1, %0\n"
-		"	bnez	%1, 1b\n"
-		"	li	%1, -1\n"
-		"	sc.w	%1, %1, %0\n"
-		"	bnez	%1, 1b\n"
-		RISCV_ACQUIRE_BARRIER
-		: "+A" (lock->lock), "=&r" (tmp)
-		:: "memory");
-}
-
-static inline int arch_read_trylock(arch_rwlock_t *lock)
-{
-	int busy;
-
-	__asm__ __volatile__(
-		"1:	lr.w	%1, %0\n"
-		"	bltz	%1, 1f\n"
-		"	addi	%1, %1, 1\n"
-		"	sc.w	%1, %1, %0\n"
-		"	bnez	%1, 1b\n"
-		RISCV_ACQUIRE_BARRIER
-		"1:\n"
-		: "+A" (lock->lock), "=&r" (busy)
-		:: "memory");
-
-	return !busy;
-}
-
-static inline int arch_write_trylock(arch_rwlock_t *lock)
-{
-	int busy;
-
-	__asm__ __volatile__(
-		"1:	lr.w	%1, %0\n"
-		"	bnez	%1, 1f\n"
-		"	li	%1, -1\n"
-		"	sc.w	%1, %1, %0\n"
-		"	bnez	%1, 1b\n"
-		RISCV_ACQUIRE_BARRIER
-		"1:\n"
-		: "+A" (lock->lock), "=&r" (busy)
-		:: "memory");
-
-	return !busy;
-}
-
-static inline void arch_read_unlock(arch_rwlock_t *lock)
-{
-	__asm__ __volatile__(
-		RISCV_RELEASE_BARRIER
-		"	amoadd.w x0, %1, %0\n"
-		: "+A" (lock->lock)
-		: "r" (-1)
-		: "memory");
-}
-
-static inline void arch_write_unlock(arch_rwlock_t *lock)
-{
-	smp_store_release(&lock->lock, 0);
-}
+#include <asm/qrwlock.h>
+#include <asm/qspinlock.h>
 
 #endif /* _ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/include/asm/spinlock_types.h b/arch/riscv/include/asm/spinlock_types.h
index 83ac4ac9e2ac..fc1cbf61d746 100644
--- a/arch/riscv/include/asm/spinlock_types.h
+++ b/arch/riscv/include/asm/spinlock_types.h
@@ -1,33 +1,8 @@
-/*
- * Copyright (C) 2015 Regents of the University of California
- *
- *   This program is free software; you can redistribute it and/or
- *   modify it under the terms of the GNU General Public License
- *   as published by the Free Software Foundation, version 2.
- *
- *   This program is distributed in the hope that it will be useful,
- *   but WITHOUT ANY WARRANTY; without even the implied warranty of
- *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *   GNU General Public License for more details.
- */
-
+/* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _ASM_RISCV_SPINLOCK_TYPES_H
 #define _ASM_RISCV_SPINLOCK_TYPES_H
 
-#ifndef __LINUX_SPINLOCK_TYPES_H
-# error "please don't include this file directly"
-#endif
-
-typedef struct {
-	volatile unsigned int lock;
-} arch_spinlock_t;
-
-#define __ARCH_SPIN_LOCK_UNLOCKED	{ 0 }
-
-typedef struct {
-	volatile unsigned int lock;
-} arch_rwlock_t;
-
-#define __ARCH_RW_LOCK_UNLOCKED		{ 0 }
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>
 
-#endif
+#endif /* _ASM_RISCV_SPINLOCK_TYPES_H */
-- 
2.17.1



* [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values
  2019-02-11  4:38 [PATCH 0/3] RISC-V: use generic spinlock and rwlock Michael Clark
  2019-02-11  4:38 ` [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short Michael Clark
  2019-02-11  4:38 ` [PATCH 2/3] RISC-V: convert custom spinlock/rwlock to generic qspinlock/qrwlock Michael Clark
@ 2019-02-11  4:38 ` Michael Clark
  2019-02-11 12:37   ` Jonas Gorski
  2019-02-11 20:03     ` Paul Burton
  2 siblings, 2 replies; 11+ messages in thread
From: Michael Clark @ 2019-02-11  4:38 UTC (permalink / raw)
  To: Linux RISC-V; +Cc: RISC-V Patches, Linux MIPS, Michael Clark

__cmpxchg_small erroneously uses u8 for load comparison which can
be either char or short. This patch changes the local varialble to
u32 which is sufficiently sized, as the loaded value is already
masked and shifted appropriately. Using an integer size avoids
any unnecessary canonicalization from use of non native widths.

This patch is part of a series that adapts the MIPS small word
atomics code for xchg and cmpxchg on short and char to RISC-V.

Cc: RISC-V Patches <patches@groups.riscv.org>
Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
Cc: Linux MIPS <linux-mips@linux-mips.org>
Signed-off-by: Michael Clark <michaeljclark@mac.com>
---
 arch/mips/kernel/cmpxchg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/kernel/cmpxchg.c b/arch/mips/kernel/cmpxchg.c
index 0b9535bc2c53..1169958fd748 100644
--- a/arch/mips/kernel/cmpxchg.c
+++ b/arch/mips/kernel/cmpxchg.c
@@ -57,7 +57,7 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
 	u32 mask, old32, new32, load32;
 	volatile u32 *ptr32;
 	unsigned int shift;
-	u8 load;
+	u32 load;
 
 	/* Check that ptr is naturally aligned */
 	WARN_ON((unsigned long)ptr & (size - 1));
-- 
2.17.1



* Re: [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values
  2019-02-11  4:38 ` [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values Michael Clark
@ 2019-02-11 12:37   ` Jonas Gorski
  2019-02-11 20:20     ` Paul Burton
  2019-02-11 20:03     ` Paul Burton
  1 sibling, 1 reply; 11+ messages in thread
From: Jonas Gorski @ 2019-02-11 12:37 UTC (permalink / raw)
  To: Michael Clark; +Cc: RISC-V Patches, Linux RISC-V, Linux MIPS

Hi,

On Mon, 11 Feb 2019 at 05:39, Michael Clark <michaeljclark@mac.com> wrote:
>
> __cmpxchg_small erroneously uses u8 for load comparison which can
> be either char or short. This patch changes the local varialble to

varialble => variable

> u32 which is sufficiently sized, as the loaded value is already
> masked and shifted appropriately. Using an integer size avoids
> any unnecessary canonicalization from use of non native widths.
>
> This patch is part of a series that adapts the MIPS small word
> atomics code for xchg and cmpxchg on short and char to RISC-V.
>
> Cc: RISC-V Patches <patches@groups.riscv.org>
> Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
> Cc: Linux MIPS <linux-mips@linux-mips.org>
> Signed-off-by: Michael Clark <michaeljclark@mac.com>
> ---
>  arch/mips/kernel/cmpxchg.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/mips/kernel/cmpxchg.c b/arch/mips/kernel/cmpxchg.c
> index 0b9535bc2c53..1169958fd748 100644
> --- a/arch/mips/kernel/cmpxchg.c
> +++ b/arch/mips/kernel/cmpxchg.c
> @@ -57,7 +57,7 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
>         u32 mask, old32, new32, load32;
>         volatile u32 *ptr32;
>         unsigned int shift;
> -       u8 load;
> +       u32 load;

There already is a u32 line above, so maybe move it there.

Also reading the code to understand this, isn't the old code broken
for cmpxchg_small calls for 16-bit variables, where old is > 0xff?

because it does later

        /*
         * Ensure the byte we want to exchange matches the expected
         * old value, and if not then bail.
         */
        load = (load32 & mask) >> shift;
        if (load != old)
            return load;

and if load is a u8, it can never be old if old contains a larger
value than what can fit in a u8.

After re-reading your commit log, it seems you say something similar,
but it wasn't quite obvious to me that this means the function is
basically broken for short values where the old value does not fit in
a u8.
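
For example (hypothetical values), with a 16-bit cmpxchg where the
value in memory and old are both 0x1234, the exchange should proceed,
but:

        load = (load32 & mask) >> shift;  /* 0x1234 truncated to 0x34 */
        if (load != old)                  /* 0x34 != 0x1234, always true */
            return load;                  /* bails, returning a bogus value */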

So this should have an appropriate "Fixes:" tag. And Cc: stable. Seems
like quite a serious issue to me.


Regards
Jonas




* Re: [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values
  2019-02-11  4:38 ` [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values Michael Clark
@ 2019-02-11 20:03   ` Paul Burton
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Burton @ 2019-02-11 20:03 UTC (permalink / raw)
  To: Michael Clark
  Cc: Linux RISC-V, Michael Clark, RISC-V Patches, Linux MIPS, linux-mips

Hello,

Michael Clark wrote:
> __cmpxchg_small erroneously uses u8 for load comparison which can
> be either char or short. This patch changes the local varialble to
> u32 which is sufficiently sized, as the loaded value is already
> masked and shifted appropriately. Using an integer size avoids
> any unnecessary canonicalization from use of non native widths.
> 
> This patch is part of a series that adapts the MIPS small word
> atomics code for xchg and cmpxchg on short and char to RISC-V.
> 
> Cc: RISC-V Patches <patches@groups.riscv.org>
> Cc: Linux RISC-V <linux-riscv@lists.infradead.org>
> Cc: Linux MIPS <linux-mips@linux-mips.org>
> Signed-off-by: Michael Clark <michaeljclark@mac.com>

Applied to mips-fixes.

Thanks,
    Paul

[ This message was auto-generated; if you believe anything is incorrect
  then please email paul.burton@mips.com to report it. ]


* Re: [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values
  2019-02-11 12:37   ` Jonas Gorski
@ 2019-02-11 20:20     ` Paul Burton
  2019-02-24  0:09       ` Michael Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Paul Burton @ 2019-02-11 20:20 UTC (permalink / raw)
  To: Jonas Gorski, Michael Clark; +Cc: RISC-V Patches, Linux RISC-V, Linux MIPS

Hello,

On Mon, Feb 11, 2019 at 01:37:08PM +0100, Jonas Gorski wrote:
> On Mon, 11 Feb 2019 at 05:39, Michael Clark <michaeljclark@mac.com> wrote:
> > diff --git a/arch/mips/kernel/cmpxchg.c b/arch/mips/kernel/cmpxchg.c
> > index 0b9535bc2c53..1169958fd748 100644
> > --- a/arch/mips/kernel/cmpxchg.c
> > +++ b/arch/mips/kernel/cmpxchg.c
> > @@ -57,7 +57,7 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
> >         u32 mask, old32, new32, load32;
> >         volatile u32 *ptr32;
> >         unsigned int shift;
> > -       u8 load;
> > +       u32 load;
> 
> There already is a u32 line above, so maybe move it there.
> 
> Also reading the code to understand this, isn't the old code broken
> for cmpxchg_small calls for 16-bit variables, where old is > 0xff?
> 
> because it does later
> 
>         /*
>          * Ensure the byte we want to exchange matches the expected
>          * old value, and if not then bail.
>          */
>         load = (load32 & mask) >> shift;
>         if (load != old)
>             return load;
> 
> and if load is a u8, it can never be old if old contains a larger
> value than what can fit in a u8.
> 
> After re-reading your commit log, it seems you say something similar,
> but it wasn't quite obvious for me that this means the function is
> basically broken for short values where the old value does not fit in
> u8.
> 
> So this should have an appropriate "Fixes:" tag. And Cc: stable. Seems
> like quite a serious issue to me.

It could be serious if used, though in practice this support was added
to support qrwlock which only really needed 8-bit support at the time.
Since then commit d13316614633 ("locking/qrwlock: Prevent slowpath
writers getting held up by fastpath") removed even that.

But still, yes it's clearly a nasty little bug if anyone does try to use
a 16-bit cmpxchg() & I've marked it for stable backport whilst applying.

Thanks for fixing this Michael :)

Paul


* Re: [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short
  2019-02-11  4:38 ` [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short Michael Clark
@ 2019-02-12  7:17   ` Christoph Hellwig
  2019-02-24 21:03     ` Michael Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2019-02-12  7:17 UTC (permalink / raw)
  To: Michael Clark; +Cc: RISC-V Patches, Linux RISC-V

>  
> +extern unsigned long __xchg_relaxed_small(volatile void *ptr, unsigned long new,
> +					  unsigned int size);
> +extern unsigned long __xchg_acquire_small(volatile void *ptr, unsigned long new,

No need for any of the externs on function declarations.

> +++ b/arch/riscv/kernel/cmpxchg.c
> @@ -0,0 +1,118 @@
> +/*
> + * Copyright (C) 2017 Imagination Technologies
> + * Author: Paul Burton <paul.burton@mips.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */

Please use an SPDX tag instead of the GPL boilerplate.

Also this code seems to be entirely generic and very similar to the
mips code.  Did you consider making it a generic library routine
in lib/ instead?


* Re: [PATCH 3/3] MIPS: fix truncation in __cmpxchg_small for short values
  2019-02-11 20:20     ` Paul Burton
@ 2019-02-24  0:09       ` Michael Clark
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Clark @ 2019-02-24  0:09 UTC (permalink / raw)
  To: Paul Burton, Jonas Gorski
  Cc: RISC-V Patches, Linux RISC-V, Andrew Waterman, Linux MIPS

Hi Paul,

On 12/02/19 9:20 AM, Paul Burton wrote:
> Hello,
> 
> On Mon, Feb 11, 2019 at 01:37:08PM +0100, Jonas Gorski wrote:
>> On Mon, 11 Feb 2019 at 05:39, Michael Clark <michaeljclark@mac.com> wrote:
>>> diff --git a/arch/mips/kernel/cmpxchg.c b/arch/mips/kernel/cmpxchg.c
>>> index 0b9535bc2c53..1169958fd748 100644
>>> --- a/arch/mips/kernel/cmpxchg.c
>>> +++ b/arch/mips/kernel/cmpxchg.c
>>> @@ -57,7 +57,7 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
>>>          u32 mask, old32, new32, load32;
>>>          volatile u32 *ptr32;
>>>          unsigned int shift;
>>> -       u8 load;
>>> +       u32 load;
>>
>> There already is a u32 line above, so maybe move it there.
>>
>> Also reading the code to understand this, isn't the old code broken
>> for cmpxchg_small calls for 16-bit variables, where old is > 0xff?
>>
>> because it does later
>>
>>          /*
>>           * Ensure the byte we want to exchange matches the expected
>>           * old value, and if not then bail.
>>           */
>>          load = (load32 & mask) >> shift;
>>          if (load != old)
>>              return load;
>>
>> and if load is a u8, it can never be old if old contains a larger
>> value than what can fit in a u8.
>>
>> After re-reading your commit log, it seems you say something similar,
>> but it wasn't quite obvious for me that this means the function is
>> basically broken for short values where the old value does not fit in
>> u8.
>>
>> So this should have an appropriate "Fixes:" tag. And Cc: stable. Seems
>> like quite a serious issue to me.

Okay. I was pretty sure it was a bug but I am not sure about the 
conventions for Linux fixes.

> It could be serious if used, though in practice this support was added
> to support qrwlock which only really needed 8-bit support at the time.
> Since then commit d13316614633 ("locking/qrwlock: Prevent slowpath
> writers getting held up by fastpath") removed even that.

Yes. I suspected it was a latent bug. Truncation of shorts in compare
and swap would only show up with unusual values.

> But still, yes it's clearly a nasty little bug if anyone does try to use
> a 16-bit cmpxchg() & I've marked it for stable backport whilst applying.
> 
> Thanks for fixing this Michael :)

Appreciated. Yes. Thanks for taking the time to verify it. Although it
is quite an obvious fix, it could be a big time suck if one encountered
it in the wild, as one wouldn't expect a broken intrinsic.

Sorry for the lag in replying. At the time it appeared somewhat like
throwing text plus code over the fence, as part of discovery for a
submission regarding RISC-V atomics. I had every intention of circling
back; it's just that I am not good with many small emails.

This has prompted me to revise a fair spinlock implementation (one that
fits into 32 bits).


.. RISC-V tangent...

Related to the parent thread: I was looking into ticket spinlocks for
threads on bare metal, which prompted this patch in the first place.

While the Linux asm-generic ticket spinlock is fair, LR/SC for small
word atomics requires significantly more instructions, and thus has a
cycle penalty for fairness vs amoadd. The problem on RISC-V, however,
is that a fair spinlock using amoadd for head and tail would need to be
64 bits in size, due to the 32-bit minimum atomic width for atomic ops.
For a per-CPU structure on a large system, this is significant memory.
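
Concretely (hypothetical layouts), an amoadd-only fair lock would need
something like:

     typedef struct {
         u32 next;   /* ticket counter, amoadd.w to acquire */
         u32 owner;  /* now serving, amoadd.w to release */
     } arch_spinlock_t;  /* 8 bytes per lock */

whereas packing both halves into one word, as in the asm below, gives:

     typedef struct {
         u32 next_owner;  /* [31:16] next ticket, [15:0] now serving */
     } arch_spinlock_t;  /* 4 bytes per lock */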

A compromise between code size and compactness of structure would be
amoadd.w of 0x0001_0000 for head acquire and LR/SC for tail release. I
chose 2^16 because 255 cores seems a bit small these days, given folk
are fitting more than a thousand RISC-V cores on an FPGA, and one
assumes 4096 is quite plausible. Anyhow, here is a 32-bit ticket
spinlock with support for 65,535 cores (needs verification):

spin_lock:
     lui     a2,0x10
     amoadd.w a5,a2,(a0)
     srliw   a4,a5,0x10
2:  slliw   a5,a5,0x10
     srliw   a5,a5,0x10
     bne     a5,a4,3f
     ret
3:  lw      a5,0(a0)
     fence   r,rw
     j       2b

spin_trylock:
     lui     a5,0x10
     lr.w.aq a4,(a0)
     addw    a5,a5,a4
     slliw   a3,a5,0x10
     srliw   a3,a3,0x10
     srliw   a4,a5,0x10
     beq     a3,a4,2f
1:  li      a0,0
     ret
2:  sc.w.rl a4,a5,(a0)
     bnez    a4,1b
     fence   r,rw
     li      a0,1
     ret

spin_unlock:
     fence   rw,w
1:  lr.w.aq a4,(a0)
     srliw   a5,a4,0x10
     addiw   a4,a4,1
     slliw   a4,a4,0x10
     srliw   a4,a4,0x10
     slliw   a5,a5,0x10
     or      a5,a5,a4
     sc.w.rl a4,a5,(a0)
     bnez    a4,1b
     ret
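
A rough C equivalent of the same scheme, using GCC atomic builtins (an
untested sketch mirroring the asm above, not kernel code):

     typedef struct { unsigned int v; } tkt_lock_t; /* [31:16] next, [15:0] owner */

     static void tkt_lock(tkt_lock_t *l)
     {
         /* amoadd.w: take a ticket from the high half */
         unsigned int t = __atomic_fetch_add(&l->v, 0x10000, __ATOMIC_ACQUIRE);
         unsigned short me = t >> 16;
         while ((unsigned short)t != me)
             t = __atomic_load_n(&l->v, __ATOMIC_ACQUIRE);
     }

     static void tkt_unlock(tkt_lock_t *l)
     {
         /* LR/SC: bump only the low half; a full-word CAS so a
          * concurrent ticket grab on the high half forces a retry */
         unsigned int old = __atomic_load_n(&l->v, __ATOMIC_RELAXED), new;
         do {
             new = (old & 0xffff0000) | ((old + 1) & 0xffff);
         } while (!__atomic_compare_exchange_n(&l->v, &old, new, 0,
                     __ATOMIC_RELEASE, __ATOMIC_RELAXED));
     }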

We could keep the current simple (but unfair) spinlock. I do suspect 
unfairness will not scale so whatever is done, it may end up needing to 
be a config option. I actually am fond of the idea of a RISC-V Atomic 
Extension V2 in RISC-V V3.0 or V4.0 with 48-bit instructions. A 6-byte 
instruction would be quite a nice compromise.

It seems that the hybrid approach (above) using amoadd.w for the head,
i.e. fast ticket number acquisition followed by a spin, is logical. This
balances the code size penalty for lock/unlock when trying to fit a
scalable ticket spinlock into 32 bits. If one swaps head and tail, then
lock acquisition has a high cost and lock release becomes trivial, which
seems wrong. spin_trylock necessarily must use LR/SC as it needs to
conditionally acquire a ticket.

This is actually RISC-V generic, so I probably should post it somewhere
where folk are interested in RISC-V software. I think if we come up with
a simple lock, it should be usable in BSD or Linux GPL, so consider any
of these fragments public domain. Verification is assumed to be applied.
The previous patches were tested in QEMU, but this asm is part
compiler-emitted and part hand-coded (the compiler is not yet smart
enough to avoid illegal branches as it doesn't parse LR/SC - that's
possibly an arm issue also, i.e. other RISCs; so just sharing
thoughts...).

Michael.


* Re: [PATCH 1/3] RISC-V: implement xchg_small and cmpxchg_small for char and short
  2019-02-12  7:17   ` Christoph Hellwig
@ 2019-02-24 21:03     ` Michael Clark
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Clark @ 2019-02-24 21:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: RISC-V Patches, Linux RISC-V



On 12/02/19 8:17 PM, Christoph Hellwig wrote:
>>   
>> +extern unsigned long __xchg_relaxed_small(volatile void *ptr, unsigned long new,
>> +					  unsigned int size);
>> +extern unsigned long __xchg_acquire_small(volatile void *ptr, unsigned long new,
> 
> No need for any of the externs on function declarations.
> 
>> +++ b/arch/riscv/kernel/cmpxchg.c
>> @@ -0,0 +1,118 @@
>> +/*
>> + * Copyright (C) 2017 Imagination Technologies
>> + * Author: Paul Burton <paul.burton@mips.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License as published by the
>> + * Free Software Foundation;  either version 2 of the  License, or (at your
>> + * option) any later version.
>> + */
> 
> Please use an SPDX tag instead of the GPL boilerplate.

Okay.

> Also this code seems to be entirely generic and very similar to the
> mips code.  Did you consider making it a generic library routine
> in lib/ instead?

Indeed. After your feedback, I agree. There is no reason why the small
word atomics support cannot be generalized; as it is, it is written in
terms of the underlying primitives and is not MIPS specific. The
additions are the relaxed, acquire and release variants, and they can
be made optional if this code is made generic. We have to think about
how the arch instantiates the templates, in the C template idiom, and
that likely requires some Kconfig magic...
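
Something along these lines, perhaps (all names hypothetical): a
lib/cmpxchg_small.c holding the templates, instantiating only the
variants the architecture actually provides, e.g.:

     /* lib/cmpxchg_small.c -- hypothetical sketch */
     TEMPLATE_CMPXCHG_SMALL(__cmpxchg_small, cmpxchg)
     #ifdef cmpxchg_relaxed
     TEMPLATE_CMPXCHG_SMALL(__cmpxchg_relaxed_small, cmpxchg_relaxed)
     #endif

guarded by a Kconfig symbol, say GENERIC_SMALL_CMPXCHG, that the arch
selects.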

In the meantime, I've absorbed the feedback regarding amoadd, and it may
be the case that we end up with an arch-specific spinlock for RISC-V. I
have not researched rwlocks yet. Below is the sample asm snipped from
another email. Note: this code has not been tested yet...

spin_lock:
     lui     a2,0x10
     amoadd.w a5,a2,(a0)
     srliw   a4,a5,0x10
2:  slliw   a5,a5,0x10
     srliw   a5,a5,0x10
     bne     a5,a4,3f
     ret
3:  lw      a5,0(a0)
     fence   r,rw
     j       2b

spin_trylock:
     lui     a5,0x10
     lr.w.aq a4,(a0)
     addw    a5,a5,a4
     slliw   a3,a5,0x10
     srliw   a3,a3,0x10
     srliw   a4,a5,0x10
     beq     a3,a4,2f
1:  li      a0,0
     ret
2:  sc.w.rl a4,a5,(a0)
     bnez    a4,1b
     fence   r,rw
     li      a0,1
     ret

spin_unlock:
     fence   rw,w
1:  lr.w.aq a4,(a0)
     srliw   a5,a4,0x10
     addiw   a4,a4,1
     slliw   a4,a4,0x10
     srliw   a4,a4,0x10
     slliw   a5,a5,0x10
     or      a5,a5,a4
     sc.w.rl a4,a5,(a0)
     bnez    a4,1b
     ret

Here is the thread where I happened to be thinking about this at the
time (a bit of a tangent, but as we know, acos and asin can be formed
from atan):

- https://patchwork.kernel.org/patch/10805093/

Reading the per-cpu amoadd thread made me consider a ticket spinlock 
that uses amoadd for lock/acquire and LR/SC for trylock and 
unlock/release. The idea is a compromise between fairness, code size and 
lock structure size. In this case we have a 32-bit spinlock and amoadd 
0x0001_0000 is used for head increment/acquire, while release is 
slightly more costly, but it balances out the code size quite nicely, 
compared to the previous 32-bit ticket spinlock code that I have been 
experimenting with (independently from Linux asm-generic qspinlock).

I am interested in fair locks, and would also like to try them in an
OoO simulator that explores the bounds of the RISC-V memory model, such
as testing correctness for the extent of what can be executed out of
order. There is quite a bit of work to get one's head around this,
because I would like to see possible alternate executions. We can't
really do this with QEMU or spike. Work such as this has previously
been done with black boxes; however, I would like OoO simulation to be
part of general simulation infrastructure for RISC-V (although I don't
currently have funding for that, as there is a reasonably significant
effort involved; small steps, like actually getting different lock
variants to test...).

Michael.

