Subject: csky: smp_mb__after_spinlock
From: peterz
Date: 2020-08-05 10:41 UTC
To: ren_guo
Cc: linux-kernel, linux-csky, mathieu.desnoyers, Will Deacon

Hi,

While doing an audit of smp_mb__after_spinlock, I found that csky
defines it. Why?

CSKY only has smp_mb(); it doesn't override __atomic_acquire_fence or
otherwise special-case its atomic*_acquire() primitives, and it has an
explicit smp_mb() in its arch_spin_lock().

---
diff --git a/arch/csky/include/asm/spinlock.h b/arch/csky/include/asm/spinlock.h
index 7cf3f2b34cea..22a05caf2d18 100644
--- a/arch/csky/include/asm/spinlock.h
+++ b/arch/csky/include/asm/spinlock.h
@@ -88,9 +88,6 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 
 #include <asm/qrwlock.h>
 
-/* See include/linux/spinlock.h */
-#define smp_mb__after_spinlock() smp_mb()
-
 #else /* CONFIG_QUEUED_RWLOCKS */
 
 /*
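For context, the generic fallback that this override shadows is, roughly
as of the v5.8-era include/linux/spinlock.h, a no-op:

#ifndef smp_mb__after_spinlock
#define smp_mb__after_spinlock()	do { } while (0)
#endif

An architecture only needs to define it when its spin_lock() is weaker
than a full barrier. Since csky's arch_spin_lock() already ends in
smp_mb(), the lock itself is fully ordered, and the csky override merely
makes every smp_mb__after_spinlock() call site emit a barrier that the
preceding lock already provides.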
Subject: Re: csky: smp_mb__after_spinlock
From: peterz
Date: 2020-08-05 10:44 UTC
To: ren_guo
Cc: linux-kernel, linux-csky, mathieu.desnoyers, Will Deacon

On Wed, Aug 05, 2020 at 12:41:46PM +0200, peterz@infradead.org wrote:
> Hi,
>
> While doing an audit of smp_mb__after_spinlock, I found that csky
> defines it. Why?
>
> CSKY only has smp_mb(); it doesn't override __atomic_acquire_fence or
> otherwise special-case its atomic*_acquire() primitives, and it has an
> explicit smp_mb() in its arch_spin_lock().

Also, why have two implementations of all the locking?

---
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index bd31ab12f77d..332738e93e57 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -7,7 +7,7 @@ config CSKY
 	select ARCH_HAS_SYNC_DMA_FOR_CPU
 	select ARCH_HAS_SYNC_DMA_FOR_DEVICE
 	select ARCH_USE_BUILTIN_BSWAP
-	select ARCH_USE_QUEUED_RWLOCKS if NR_CPUS>2
+	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_WANT_FRAME_POINTERS if !CPU_CK610
 	select COMMON_CLK
 	select CLKSRC_MMIO
diff --git a/arch/csky/include/asm/spinlock.h b/arch/csky/include/asm/spinlock.h
index 7cf3f2b34cea..69f5aa249c5f 100644
--- a/arch/csky/include/asm/spinlock.h
+++ b/arch/csky/include/asm/spinlock.h
@@ -6,8 +6,6 @@
 #include <linux/spinlock_types.h>
 #include <asm/barrier.h>
 
-#ifdef CONFIG_QUEUED_RWLOCKS
-
 /*
  * Ticket-based spin-locking.
  */
@@ -88,169 +86,4 @@ static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 
 #include <asm/qrwlock.h>
 
-/* See include/linux/spinlock.h */
-#define smp_mb__after_spinlock() smp_mb()
-
-#else /* CONFIG_QUEUED_RWLOCKS */
-
-/*
- * Test-and-set spin-locking.
- */
-static inline void arch_spin_lock(arch_spinlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	bnez		%0, 1b   \n"
-		"	movi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-	smp_mb();
-}
-
-static inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
-	smp_mb();
-	WRITE_ONCE(lock->lock, 0);
-}
-
-static inline int arch_spin_trylock(arch_spinlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	bnez		%0, 2f   \n"
-		"	movi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		"	movi		%0, 0    \n"
-		"2:				 \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-
-	if (!tmp)
-		smp_mb();
-
-	return !tmp;
-}
-
-#define arch_spin_is_locked(x)	(READ_ONCE((x)->lock) != 0)
-
-/*
- * read lock/unlock/trylock
- */
-static inline void arch_read_lock(arch_rwlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	blz		%0, 1b   \n"
-		"	addi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-	smp_mb();
-}
-
-static inline void arch_read_unlock(arch_rwlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	smp_mb();
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	subi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-}
-
-static inline int arch_read_trylock(arch_rwlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	blz		%0, 2f   \n"
-		"	addi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		"	movi		%0, 0    \n"
-		"2:				 \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-
-	if (!tmp)
-		smp_mb();
-
-	return !tmp;
-}
-
-/*
- * write lock/unlock/trylock
- */
-static inline void arch_write_lock(arch_rwlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	bnez		%0, 1b   \n"
-		"	subi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-	smp_mb();
-}
-
-static inline void arch_write_unlock(arch_rwlock_t *lock)
-{
-	smp_mb();
-	WRITE_ONCE(lock->lock, 0);
-}
-
-static inline int arch_write_trylock(arch_rwlock_t *lock)
-{
-	u32 *p = &lock->lock;
-	u32 tmp;
-
-	asm volatile (
-		"1:	ldex.w		%0, (%1) \n"
-		"	bnez		%0, 2f   \n"
-		"	subi		%0, 1    \n"
-		"	stex.w		%0, (%1) \n"
-		"	bez		%0, 1b   \n"
-		"	movi		%0, 0    \n"
-		"2:				 \n"
-		: "=&r" (tmp)
-		: "r"(p)
-		: "cc");
-
-	if (!tmp)
-		smp_mb();
-
-	return !tmp;
-}
-
-#endif /* CONFIG_QUEUED_RWLOCKS */
 #endif /* __ASM_CSKY_SPINLOCK_H */
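For readers unfamiliar with the csky primitives: ldex.w/stex.w are a
load-exclusive/store-exclusive (LL/SC) pair, and the removed
arch_spin_lock() above amounts to the classic test-and-set loop. A
minimal sketch of the same logic in generic kernel C, assuming csky's
arch_spinlock_t with a single 32-bit 'lock' word (the function name is
made up for illustration):

static inline void tas_spin_lock(arch_spinlock_t *lock)
{
	/* Spin until we observe 0 and manage to install 1. */
	while (cmpxchg_relaxed(&lock->lock, 0, 1) != 0)
		cpu_relax();

	smp_mb();	/* order the critical section after the lock */
}

The ticket lock in the retained branch provides the same API with FIFO
fairness on top, which is why keeping the test-and-set variant as well
buys nothing.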
Subject: Re: csky: smp_mb__after_spinlock
From: Guo Ren
Date: 2020-08-07 0:23 UTC
To: Peter Zijlstra
Cc: Ren Guo, Linux Kernel Mailing List, linux-csky, mathieu.desnoyers, Will Deacon

Acked-by: Guo Ren <guoren@kernel.org>

On Thu, Aug 6, 2020 at 3:55 AM <peterz@infradead.org> wrote:
>
> On Wed, Aug 05, 2020 at 12:41:46PM +0200, peterz@infradead.org wrote:
> > Hi,
> >
> > While doing an audit of smp_mb__after_spinlock, I found that csky
> > defines it. Why?
> >
> > CSKY only has smp_mb(); it doesn't override __atomic_acquire_fence or
> > otherwise special-case its atomic*_acquire() primitives, and it has an
> > explicit smp_mb() in its arch_spin_lock().
>
> Also, why have two implementations of all the locking?
>
> [...]

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/
Subject: Re: csky: smp_mb__after_spinlock
From: Guo Ren
Date: 2020-08-07 0:22 UTC
To: Peter Zijlstra
Cc: Ren Guo, Linux Kernel Mailing List, linux-csky, mathieu.desnoyers, Will Deacon

Hi Peter,

On Thu, Aug 6, 2020 at 3:53 AM <peterz@infradead.org> wrote:
>
> Hi,
>
> While doing an audit of smp_mb__after_spinlock, I found that csky
> defines it. Why?
>
> CSKY only has smp_mb(); it doesn't override __atomic_acquire_fence or
> otherwise special-case its atomic*_acquire() primitives, and it has an
> explicit smp_mb() in its arch_spin_lock().

Yes, csky doesn't implement ACQUIRE/RELEASE in spinlock.h, so the macro
is redundant and its side effect (an extra full barrier) is wrong; we
should remove it as a fix.

TODO:
 - implement csky's ACQUIRE/RELEASE barriers (see the generic fallback
   sketch after this message for where they hook in)

> Also, why have two implementations of all the locking?

I just kept my baby's code :P. Now we can remove it and keep only the
ticket one.

BTW, I want to change the spinlock to qspinlock, but csky only has
32-bit atomic operations in hardware. Any plan to deal with this in
qspinlock?

For the embedded scenario, qspinlock seems a bit heavy; is any
ticket-like common spinlock infrastructure planned?

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/
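For reference, Guo Ren's TODO hooks into the generic atomic fallbacks,
which, roughly as of the v5.8-era include/linux/atomic-fallback.h, build
the *_acquire() variants like this:

#ifndef __atomic_acquire_fence
#define __atomic_acquire_fence		smp_mb__after_atomic
#endif

#ifndef atomic_cmpxchg_acquire
static __always_inline int
atomic_cmpxchg_acquire(atomic_t *v, int old, int new)
{
	int ret = atomic_cmpxchg_relaxed(v, old, new);
	__atomic_acquire_fence();
	return ret;
}
#endif

An architecture that can implement __atomic_acquire_fence() and
__atomic_release_fence() with something cheaper than a full smp_mb()
gets cheaper acquire/release atomics, and locks built on them, for free.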
Subject: Re: csky: smp_mb__after_spinlock
From: peterz
Date: 2020-08-07 8:12 UTC
To: Guo Ren
Cc: Ren Guo, Linux Kernel Mailing List, linux-csky, mathieu.desnoyers, Will Deacon

On Fri, Aug 07, 2020 at 08:22:36AM +0800, Guo Ren wrote:
> Hi Peter,
>
> Yes, csky doesn't implement ACQUIRE/RELEASE in spinlock.h, so the macro
> is redundant and its side effect (an extra full barrier) is wrong; we
> should remove it as a fix.
>
> TODO:
>  - implement csky's ACQUIRE/RELEASE barriers

Fair enough; please Cc me when you get to it.

> I just kept my baby's code :P. Now we can remove it and keep only the
> ticket one.
>
> BTW, I want to change the spinlock to qspinlock, but csky only has
> 32-bit atomic operations in hardware.

Yeah, that's okay; you can do 'short' atomics using a wider RmW.

> Any plan to deal with this in qspinlock?
>
> For the embedded scenario, qspinlock seems a bit heavy;

I tend to agree; qspinlock only really makes sense when you have
multiple cache domains, and NUMA makes it shine.

> is any ticket-like common spinlock infrastructure planned?

No. Ticket locks are 'simple' and their implementation is fairly
straightforward. Also, since the generic code only has cmpxchg, a
generic ticket lock would be sub-optimal on LL/SC (or even x86, for
that matter).

It would look something like this, which I think is pretty terrible
for just about any arch compared to what you can do with inline asm:

struct ticket_lock {
	union {
		atomic_t val;
		struct {
#ifdef __LITTLE_ENDIAN
			u16 owner;	/* now serving: numeric low half */
			u16 next;	/* next free ticket: numeric high half */
#else
			u16 next;
			u16 owner;
#endif
		};
	};
};

void ticket_lock(struct ticket_lock *lock)
{
	/* Take a ticket; the fully ordered RmW also covers the fast path. */
	u32 val = atomic_fetch_add(1 << 16, &lock->val);
	u16 ticket = val >> 16;

	if (ticket == (u16)val)		/* uncontended: we own it */
		return;

	/* Wait for 'owner' to reach our ticket. */
	atomic_cond_read_acquire(&lock->val, ticket == (u16)VAL);
}

void ticket_unlock(struct ticket_lock *lock)
{
	u32 val = atomic_read(&lock->val);

	/*
	 * Release-store only the owner half, so the increment wraps
	 * within the u16 and cannot carry into the ticket half.
	 */
	smp_store_release(&lock->owner, (u16)(val + 1));
}
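Peter's remark about doing 'short' atomics with a wider RmW refers to a
construction like the sketch below: a 16-bit exchange built from the
32-bit cmpxchg that csky does have, which is roughly what qspinlock's
xchg_tail() wants. The helper name xchg16_relaxed() is hypothetical, and
a little-endian layout is assumed for the shift computation:

static inline u16 xchg16_relaxed(u16 *p, u16 val)
{
	u32 *word = (u32 *)((unsigned long)p & ~3UL);	/* containing word */
	int shift = ((unsigned long)p & 2) * 8;		/* 0 or 16 on LE */
	u32 mask = 0xffffU << shift;
	u32 old, new;

	do {
		old = READ_ONCE(*word);
		new = (old & ~mask) | ((u32)val << shift);
	} while (cmpxchg_relaxed(word, old, new) != old);

	return (u16)((old & mask) >> shift);
}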