From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753765AbbKBRmI (ORCPT );
	Mon, 2 Nov 2015 12:42:08 -0500
Received: from foss.arm.com ([217.140.101.70]:51991 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751425AbbKBRmD (ORCPT );
	Mon, 2 Nov 2015 12:42:03 -0500
Date: Mon, 2 Nov 2015 17:42:01 +0000
From: Will Deacon
To: Peter Zijlstra
Cc: mingo@kernel.org, oleg@redhat.com, linux-kernel@vger.kernel.org,
	paulmck@linux.vnet.ibm.com, boqun.feng@gmail.com, corbet@lwn.net,
	mhocko@kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org
Subject: Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()
Message-ID: <20151102174200.GJ29657@arm.com>
References: <20151102132901.157178466@infradead.org>
	<20151102134941.005198372@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151102134941.005198372@infradead.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On Mon, Nov 02, 2015 at 02:29:05PM +0100, Peter Zijlstra wrote:
> Introduce smp_cond_acquire() which combines a control dependency and a
> read barrier to form acquire semantics.
> 
> This primitive has two benefits:
> 
>   - it documents control dependencies,
>   - it's typically cheaper than using smp_load_acquire() in a loop.

I'm not sure that's necessarily true on arm64, where we have a native
load-acquire instruction, but not a READ -> READ barrier (smp_rmb()
orders prior loads against subsequent loads and stores for us). Perhaps
we could allow architectures to provide their own definition of
smp_cond_acquire in case they can implement it more efficiently?

> Note that while smp_cond_acquire() has an explicit
> smp_read_barrier_depends() for Alpha, neither of the sites it gets used
> in was actually buggy on Alpha for lack of it. The first uses
> smp_rmb(), which on Alpha is a full barrier too and therefore serves
> its purpose. The second had an explicit full barrier.
> 
> Signed-off-by: Peter Zijlstra (Intel)
> ---
>  include/linux/compiler.h |   18 ++++++++++++++++++
>  kernel/sched/core.c      |    8 +-------
>  kernel/task_work.c       |    4 ++--
>  3 files changed, 21 insertions(+), 9 deletions(-)
> 
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -275,6 +275,24 @@ static __always_inline void __write_once
>  	__val;						\
>  })
> 
> +/**
> + * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering
> + * @cond: boolean expression to wait for
> + *
> + * Equivalent to using smp_load_acquire() on the condition variable but employs
> + * the control dependency of the wait to reduce the barrier on many platforms.
> + *
> + * The control dependency provides a LOAD->STORE order, the additional RMB
> + * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order,
> + * aka. ACQUIRE.
> + */
> +#define smp_cond_acquire(cond)	do {			\

I think the previous version that you posted/discussed had the actual
address of the variable being loaded passed in here too? That would be
useful for arm64, where we can wait-until-memory-location-has-changed
to save us re-evaluating cond prematurely.

> +	while (!(cond))					\
> +		cpu_relax();				\
> +	smp_read_barrier_depends(); /* ctrl */		\
> +	smp_rmb(); /* ctrl + rmb := acquire */		\

It's actually stronger than acquire, I think, because accesses before
the smp_cond_acquire cannot be moved across it.
> +} while (0)
> +
>  #endif /* __KERNEL__ */
> 
>  #endif /* __ASSEMBLY__ */
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2111,19 +2111,13 @@ try_to_wake_up(struct task_struct *p, un
>  	/*
>  	 * If the owning (remote) cpu is still in the middle of schedule() with
>  	 * this task as prev, wait until its done referencing the task.
> -	 */
> -	while (p->on_cpu)
> -		cpu_relax();
> -	/*
> -	 * Combined with the control dependency above, we have an effective
> -	 * smp_load_acquire() without the need for full barriers.
>  	 *
>  	 * Pairs with the smp_store_release() in finish_lock_switch().
>  	 *
>  	 * This ensures that tasks getting woken will be fully ordered against
>  	 * their previous state and preserve Program Order.
>  	 */
> -	smp_rmb();
> +	smp_cond_acquire(!p->on_cpu);
> 
>  	p->sched_contributes_to_load = !!task_contributes_to_load(p);
>  	p->state = TASK_WAKING;
> --- a/kernel/task_work.c
> +++ b/kernel/task_work.c
> @@ -102,13 +102,13 @@ void task_work_run(void)
> 
>  		if (!work)
>  			break;
> +
>  		/*
>  		 * Synchronize with task_work_cancel(). It can't remove
>  		 * the first entry == work, cmpxchg(task_works) should
>  		 * fail, but it can play with *work and other entries.
>  		 */
> -		raw_spin_unlock_wait(&task->pi_lock);
> -		smp_mb();
> +		smp_cond_acquire(!raw_spin_is_locked(&task->pi_lock));

Hmm, there's some sort of release equivalent in kernel/exit.c, but I
couldn't easily figure out whether we could do anything there. If we
could, we could kill raw_spin_unlock_wait :)

Will
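
For illustration, a minimal sketch of the two ideas raised above: an
arch-overridable definition, and passing the address of the watched
variable so an architecture could wait for the location to change
(e.g. with WFE on arm64) instead of spinning. The two-argument form and
the __smp_cond_wait() hook are assumptions for the sketch, not part of
the posted patch:

/* Generic fallback: just spin; an arch can supply a smarter wait. */
#ifndef __smp_cond_wait
#define __smp_cond_wait(ptr)	cpu_relax()
#endif

/* Hypothetical variant that also takes the address being watched. */
#define smp_cond_acquire(ptr, cond) do {		\
	while (!(cond))					\
		__smp_cond_wait(ptr);			\
	smp_read_barrier_depends(); /* ctrl */		\
	smp_rmb(); /* ctrl + rmb := acquire */		\
} while (0)

/* The try_to_wake_up() site in the quoted hunk would then read: */
smp_cond_acquire(&p->on_cpu, !p->on_cpu);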