From: Will Deacon <will@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Guo Ren" <guoren@kernel.org>,
	"Christoph Müllner" <christophm30@gmail.com>,
	"Palmer Dabbelt" <palmer@dabbelt.com>,
	"Anup Patel" <anup@brainfault.org>,
	linux-riscv <linux-riscv@lists.infradead.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Guo Ren" <guoren@linux.alibaba.com>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Will Deacon" <will.deacon@arm.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	jonas@southpole.se, stefan.kristiansson@saunalahti.fi,
	shorne@gmail.com
Subject: Re: [RFC][PATCH] locking: Generic ticket-lock
Date: Mon, 19 Apr 2021 18:35:43 +0100
Message-ID: <20210419173543.GC31045@willie-the-truck>
In-Reply-To: <YHbBBuVFNnI4kjj3@hirez.programming.kicks-ass.net>

On Wed, Apr 14, 2021 at 12:16:38PM +0200, Peter Zijlstra wrote:
> How's this then? Compile tested only on openrisc/simple_smp_defconfig.
>
> diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
> index d74b13825501..a7a1296b0b4d 100644
> --- a/include/asm-generic/qspinlock.h
> +++ b/include/asm-generic/qspinlock.h
> @@ -2,6 +2,36 @@
>  /*
>   * Queued spinlock
>   *
> + * A 'generic' spinlock implementation that is based on MCS locks. An
> + * architecture that's looking for a 'generic' spinlock, please first consider
> + * ticket-lock.h and only come looking here when you've considered all the
> + * constraints below and can show your hardware does actually perform better
> + * with qspinlock.
> + *
> + *
> + * It relies on atomic_*_release()/atomic_*_acquire() to be RCsc (or no weaker
> + * than RCtso if you're power), where regular code only expects atomic_t to be
> + * RCpc.

Maybe capitalise "Power" to make it clear this is about the architecture?

> + *
> + * It relies on a far greater (compared to ticket-lock.h) set of atomic
> + * operations to behave well together, please audit them carefully to ensure
> + * they all have forward progress. Many atomic operations may default to
> + * cmpxchg() loops which will not have good forward progress properties on
> + * LL/SC architectures.
> + *
> + * One notable example is atomic_fetch_or_acquire(), which x86 cannot (cheaply)
> + * do. Carefully read the patches that introduced queued_fetch_set_pending_acquire().
> + *
> + * It also heavily relies on mixed size atomic operations, in specific it
> + * requires architectures to have xchg16; something which many LL/SC
> + * architectures need to implement as a 32bit and+or in order to satisfy the
> + * forward progress guarantees mentioned above.
> + *
> + * Further reading on mixed size atomics that might be relevant:
> + *
> + *   http://www.cl.cam.ac.uk/~pes20/popl17/mixed-size.pdf
> + *
> + *
>   * (C) Copyright 2013-2015 Hewlett-Packard Development Company, L.P.
>   * (C) Copyright 2015 Hewlett-Packard Enterprise Development LP
>   *
> diff --git a/include/asm-generic/ticket-lock-types.h b/include/asm-generic/ticket-lock-types.h
> new file mode 100644
> index 000000000000..829759aedda8
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock-types.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +
> +#include <linux/types.h>
> +typedef atomic_t arch_spinlock_t;
> +
> +#define __ARCH_SPIN_LOCK_UNLOCKED	ATOMIC_INIT(0)
> +
> +#endif /* __ASM_GENERIC_TICKET_LOCK_TYPES_H */
> diff --git a/include/asm-generic/ticket-lock.h b/include/asm-generic/ticket-lock.h
> new file mode 100644
> index 000000000000..3f0d53e21a37
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock.h
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * 'Generic' ticket-lock implementation.
> + *
> + * It relies on atomic_fetch_add() having well defined forward progress
> + * guarantees under contention. If your architecture cannot provide this, stick
> + * to a test-and-set lock.
> + *
> + * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
> + * sub-word of the value. This is generally true for anything LL/SC although
> + * you'd be hard pressed to find anything useful in architecture specifications
> + * about this. If your architecture cannot do this you might be better off with
> + * a test-and-set.
> + *
> + * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
> + * uses atomic_fetch_add() which is SC to create an RCsc lock.
> + *
> + * The implementation uses smp_cond_load_acquire() to spin, so if the
> + * architecture has WFE like instructions to sleep instead of poll for word
> + * modifications be sure to implement that (see ARM64 for example).
> + *
> + */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_H
> +#define __ASM_GENERIC_TICKET_LOCK_H
> +
> +#include <linux/atomic.h>
> +#include <asm/ticket-lock-types.h>
> +
> +static __always_inline void ticket_lock(arch_spinlock_t *lock)
> +{
> +	u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */

I hate to say it, but smp_mb__after_unlock_lock() would make the intention
a lot clearer here :( That is, the implementation as you have it gives
stronger than RCsc semantics for all architectures.

Alternatively, we could write the thing RCpc and throw an smp_mb() into
the unlock path if CONFIG_ARCH_WEAK_RELEASE_ACQUIRE.

> +	u16 ticket = val >> 16;
> +
> +	if (ticket == (u16)val)
> +		return;
> +
> +	atomic_cond_read_acquire(lock, ticket == (u16)VAL);
> +}
> +
> +static __always_inline bool ticket_trylock(arch_spinlock_t *lock)
> +{
> +	u32 old = atomic_read(lock);
> +
> +	if ((old >> 16) != (old & 0xffff))
> +		return false;
> +
> +	return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
> +}
> +
> +static __always_inline void ticket_unlock(arch_spinlock_t *lock)
> +{
> +	u16 *ptr = (u16 *)lock + __is_defined(__BIG_ENDIAN);
> +	u32 val = atomic_read(lock);
> +
> +	smp_store_release(ptr, (u16)val + 1);
> +}
> +
> +static __always_inline int ticket_is_locked(arch_spinlock_t *lock)
> +{
> +	u32 val = atomic_read(lock);
> +
> +	return ((val >> 16) != (val & 0xffff));
> +}
> +
> +static __always_inline int ticket_is_contended(arch_spinlock_t *lock)
> +{
> +	u32 val = atomic_read(lock);
> +
> +	return (s16)((val >> 16) - (val & 0xffff)) > 1;

Does this go wonky if the tickets are in the process of wrapping around?

Will