From: Peter Zijlstra <peterz@infradead.org>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Stafford Horne <shorne@gmail.com>, Guo Ren <guoren@kernel.org>,
	linux-riscv <linux-riscv@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-csky@vger.kernel.org,
	linux-arch <linux-arch@vger.kernel.org>,
	Guo Ren <guoren@linux.alibaba.com>, Will Deacon <will@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Waiman Long <longman@redhat.com>,
	Anup Patel <anup@brainfault.org>
Subject: Re: [PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32
Date: Wed, 7 Apr 2021 13:36:45 +0200
Message-ID: <YG2ZTSFMGrikYWuL@hirez.programming.kicks-ass.net>
In-Reply-To: <CAK8P3a3Pf3TbGoVP7JP7gfPV-WDM8MHV_hdqSwNKKFDr1Sb3zQ@mail.gmail.com>

On Wed, Apr 07, 2021 at 10:42:50AM +0200, Arnd Bergmann wrote:

> Since there are really only a handful of instances in the kernel
> that use the cmpxchg() or xchg() on u8/u16 variables, it would seem
> best to just disallow those completely

Not going to happen. xchg16 is optimal for qspinlock and if we replace
that with a cmpxchg loop on x86 we're regressing.

> Interestingly, the s390 version using __sync_val_compare_and_swap()
> seems to produce nice output on all architectures that have atomic
> instructions, with any supported compiler, to the point where I think
> we could just use that to replace most of the inline-asm versions except
> for arm64:
>
> #define cmpxchg(ptr, o, n)                                              \
> ({                                                                      \
>         __typeof__(*(ptr)) __o = (o);                                   \
>         __typeof__(*(ptr)) __n = (n);                                   \
>         (__typeof__(*(ptr))) __sync_val_compare_and_swap((ptr),__o,__n);\
> })

It generates the LL/SC loop, but doesn't do sensible optimizations when
it's again used in a loop itself. That is, it generates a loop of a
loop, just like what you'd expect, which is sub-optimal for LL/SC.

> Not sure how gcc's acquire/release behavior of __sync_val_compare_and_swap()
> relates to what the kernel wants here.
>
> The gcc documentation also recommends using the standard
> __atomic_compare_exchange_n() builtin instead, which would allow
> constructing release/acquire/relaxed versions as well, but I could not
> get it to produce equally good output. (possibly I was using it wrong)

I'm scared to death of the C11 crap, the compiler will 'optimize' them
when it feels like it and use the C11 memory model rules for it, which
are not compatible with the kernel rules.

But the same thing applies, it won't do the right thing for composites.
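
For illustration only, here is a minimal userspace sketch of the kind of
composite discussed above: a 16-bit exchange built on top of the quoted
cmpxchg() macro. The helper name xchg16_via_cmpxchg32() and the choice of
the low 16 bits are assumptions made for this example; this is not the
kernel's qspinlock code. On an LL/SC architecture the builtin is itself
expected to expand to a retry loop, so the do/while below becomes the
outer loop of the "loop of a loop" case.

```c
#include <stdint.h>

/* The cmpxchg() macro quoted in the mail (needs GCC/Clang extensions). */
#define cmpxchg(ptr, o, n)                                                 \
({                                                                         \
	__typeof__(*(ptr)) __o = (o);                                      \
	__typeof__(*(ptr)) __n = (n);                                      \
	(__typeof__(*(ptr))) __sync_val_compare_and_swap((ptr), __o, __n); \
})

/*
 * Hypothetical helper (not a kernel API): replace the low 16 bits of
 * *word with newval and return the previous low 16 bits, using only a
 * 32-bit cmpxchg(). The upper 16 bits are preserved.
 */
static uint16_t xchg16_via_cmpxchg32(uint32_t *word, uint16_t newval)
{
	uint32_t old, new, seen;

	/* Initial snapshot; a stale value only costs one extra retry. */
	seen = *(volatile uint32_t *)word;
	do {
		old = seen;
		new = (old & 0xffff0000u) | newval;
		/* Outer retry loop around whatever cmpxchg() expands to. */
		seen = cmpxchg(word, old, new);
	} while (seen != old);

	return (uint16_t)(old & 0xffffu);
}
```

Calling xchg16_via_cmpxchg32(&w, 0x5678) with w == 0xabcd1234 returns
0x1234 and leaves w == 0xabcd5678. A native 16-bit xchg does the same job
with no retry loop, which is the regression concern raised for x86.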