Date: Tue, 6 Oct 2015 17:35:22 +0100
From: Will Deacon
To: Peter Zijlstra
Cc: Oleg Nesterov, Boqun Feng, "Paul E. McKenney",
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    Jonathan Corbet, Michal Hocko, David Howells, Linus Torvalds
Subject: Re: [PATCH] Documentation: Remove misleading examples of the barriers in wake_*()
Message-ID: <20151006163521.GD2416@arm.com>
In-Reply-To: <20151006162423.GH11639@twins.programming.kicks-ass.net>

On Tue, Oct 06, 2015 at 06:24:23PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 06, 2015 at 06:04:50PM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 21, 2015 at 07:46:11PM +0200, Oleg Nesterov wrote:
> > > On 09/18, Peter Zijlstra wrote:
> > > >
> > > > the text is correct, right?
> > >
> > > Yes, it looks good to me and helpful.
> > >
> > > But damn. I forgot why exactly try_to_wake_up() needs rmb() after
> > > ->on_cpu check... It looks reasonable in any case, but I do not
> > > see any strong reason immediately.
> >
> > I read it like the smp_rmb() we have for
> > acquire__after_spin_is_unlocked. Except, as you note below, we need to
> > need an smp_read_barrier_depends for control barriers as well....
> >
> > Yes, but I'm not sure we should go write:
> >
> >     while (READ_ONCE_CTRL(p->on_cpu))
> >         cpu_relax();
> >
> > Or:
> >
> >     while (p->on_cpu)
> >         cpu_relax();
> >
> >     smp_read_barrier_depends();
> >
> > It seems to me that doing the smp_mb() (for Alpha) inside the loop might
> > be sub-optimal.
>
> And also referring to:
>
>   lkml.kernel.org/r/20150812133109.GA8266@redhat.com
>
> Do we want something like this?
>
> #define smp_spin_acquire(cond) do {                 \
>         while (cond)                                \
>                 cpu_relax();                        \
>         smp_read_barrier_depends(); /* ctrl */      \
>         smp_rmb(); /* ctrl + rmb := acquire */      \
> } while (0)
>
> And use it like:
>
>         smp_spin_acquire(raw_spin_is_locked(&task->pi_lock));
>
> That might work for your task_work_run() and the scheduler case,
> although it might be somewhat awkward for sem_wait_array().

I could *really* use something like this for implementing power-saving
busy loops for arch/arm64 (i.e. in the qrwlock code). We have a WFE
instruction (wait for event) that can stop the processor clock and
resume it when the exclusive monitor is cleared (i.e. a cacheline
migrates to another CPU). That means we can implement a targeted
wake-up when an unlocker writes to a node in a queued lock, which
isn't something expressible with cpu_relax alone.

Will
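
For anyone following along outside the kernel tree, the "ctrl + rmb := acquire" idea behind the proposed smp_spin_acquire() can be approximated in portable C11. This is only a userspace sketch under stated assumptions: the names relax_hint(), spin_until_clear_acquire() and release_clear() are hypothetical, the relaxed load followed by an acquire fence stands in for the control dependency plus smp_rmb(), and nothing here models WFE or Alpha's dependency barrier.

```c
#include <stdatomic.h>

/*
 * Hypothetical stand-in for cpu_relax(): only a compiler barrier.
 * A real implementation would emit an architecture-specific hint.
 */
static inline void relax_hint(void)
{
    atomic_signal_fence(memory_order_seq_cst);
}

/*
 * Spin with cheap relaxed loads until *flag reads as zero, then issue a
 * single acquire fence. The fence sits outside the loop, so its cost is
 * paid once rather than on every iteration.
 */
static inline void spin_until_clear_acquire(atomic_int *flag)
{
    while (atomic_load_explicit(flag, memory_order_relaxed))
        relax_hint();
    atomic_thread_fence(memory_order_acquire);
}

/*
 * Hypothetical unlocker side: writes made before this release store are
 * visible to a waiter once spin_until_clear_acquire() has returned.
 */
static inline void release_clear(atomic_int *flag)
{
    atomic_store_explicit(flag, 0, memory_order_release);
}
```

Keeping the fence after the loop mirrors the concern quoted above about paying for a barrier on each iteration. A WFE-based variant would replace relax_hint() with an architecture-specific wait, which this sketch does not attempt.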