From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753665AbbKBNw2 (ORCPT ); Mon, 2 Nov 2015 08:52:28 -0500 Received: from bombadil.infradead.org ([198.137.202.9]:42370 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752337AbbKBNwY (ORCPT ); Mon, 2 Nov 2015 08:52:24 -0500 Message-Id: <20151102134941.005198372@infradead.org> User-Agent: quilt/0.61-1 Date: Mon, 02 Nov 2015 14:29:05 +0100 From: Peter Zijlstra To: mingo@kernel.org, oleg@redhat.com Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, paulmck@linux.vnet.ibm.com, boqun.feng@gmail.com, corbet@lwn.net, mhocko@kernel.org, dhowells@redhat.com, torvalds@linux-foundation.org, will.deacon@arm.com Subject: [PATCH 4/4] locking: Introduce smp_cond_acquire() References: <20151102132901.157178466@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Disposition: inline; filename=peterz-smp_cond_acquire.patch Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Introduce smp_cond_acquire() which combines a control dependency and a read barrier to form acquire semantics. This primitive has two benefits: - it documents control dependencies, - its typically cheaper than using smp_load_acquire() in a loop. Note that while smp_cond_acquire() has an explicit smp_read_barrier_depends() for Alpha, neither sites it gets used in were actually buggy on Alpha for their lack of it. The first uses smp_rmb(), which on Alpha is a full barrier too and therefore serves its purpose. The second had an explicit full barrier. Signed-off-by: Peter Zijlstra (Intel) --- include/linux/compiler.h | 18 ++++++++++++++++++ kernel/sched/core.c | 8 +------- kernel/task_work.c | 4 ++-- 3 files changed, 21 insertions(+), 9 deletions(-) --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -275,6 +275,24 @@ static __always_inline void __write_once __val; \ }) +/** + * smp_cond_acquire() - Spin wait for cond with ACQUIRE ordering + * @cond: boolean expression to wait for + * + * Equivalent to using smp_load_acquire() on the condition variable but employs + * the control dependency of the wait to reduce the barrier on many platforms. + * + * The control dependency provides a LOAD->STORE order, the additional RMB + * provides LOAD->LOAD order, together they provide LOAD->{LOAD,STORE} order, + * aka. ACQUIRE. + */ +#define smp_cond_acquire(cond) do { \ + while (!(cond)) \ + cpu_relax(); \ + smp_read_barrier_depends(); /* ctrl */ \ + smp_rmb(); /* ctrl + rmb := acquire */ \ +} while (0) + #endif /* __KERNEL__ */ #endif /* __ASSEMBLY__ */ --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2111,19 +2111,13 @@ try_to_wake_up(struct task_struct *p, un /* * If the owning (remote) cpu is still in the middle of schedule() with * this task as prev, wait until its done referencing the task. - */ - while (p->on_cpu) - cpu_relax(); - /* - * Combined with the control dependency above, we have an effective - * smp_load_acquire() without the need for full barriers. * * Pairs with the smp_store_release() in finish_lock_switch(). * * This ensures that tasks getting woken will be fully ordered against * their previous state and preserve Program Order. */ - smp_rmb(); + smp_cond_acquire(!p->on_cpu); p->sched_contributes_to_load = !!task_contributes_to_load(p); p->state = TASK_WAKING; --- a/kernel/task_work.c +++ b/kernel/task_work.c @@ -102,13 +102,13 @@ void task_work_run(void) if (!work) break; + /* * Synchronize with task_work_cancel(). It can't remove * the first entry == work, cmpxchg(task_works) should * fail, but it can play with *work and other entries. */ - raw_spin_unlock_wait(&task->pi_lock); - smp_mb(); + smp_cond_acquire(!raw_spin_is_locked(&task->pi_lock)); do { next = work->next;