From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754641AbcIEHtK (ORCPT ); Mon, 5 Sep 2016 03:49:10 -0400 Received: from merlin.infradead.org ([205.233.59.134]:57354 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750945AbcIEHtI (ORCPT ); Mon, 5 Sep 2016 03:49:08 -0400 Date: Mon, 5 Sep 2016 09:48:58 +0200 From: Peter Zijlstra To: Benjamin Herrenschmidt Cc: Balbir Singh , LKML , Oleg Nesterov , Nicholas Piggin , Alexey Kardashevskiy Subject: Re: [RESEND][v2][PATCH] Fix a race between try_to_wake_up() and a woken up task Message-ID: <20160905074858.GP10153@twins.programming.kicks-ass.net> References: <33872563-f4bb-0b01-6848-aaa8482aa91f@gmail.com> <1473059659.2313.45.camel@kernel.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1473059659.2313.45.camel@kernel.crashing.org> User-Agent: Mutt/1.5.23.1 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Sep 05, 2016 at 05:14:19PM +1000, Benjamin Herrenschmidt wrote: > On Mon, 2016-09-05 at 13:16 +1000, Balbir Singh wrote: > >  .../... > > > > Cc: Peter Zijlstra > > Cc: Nicholas Piggin > > Acked-by: Benjamin Herrenschmidt > > > Signed-off-by: Balbir Singh > > --- > >  kernel/sched/core.c | 11 +++++++++++ > >  1 file changed, 11 insertions(+) > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index 2a906f2..582c684 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -2016,6 +2016,17 @@ try_to_wake_up(struct task_struct *p, unsigned > > int state, int wake_flags) > >   success = 1; /* we're going to change ->state */ > >   cpu = task_cpu(p); > >   > > + /* > > +  * Ensure we see on_rq and p_state consistently > > +  * > > +  * For example in __rwsem_down_write_failed(), we have > > +  *    [S] ->on_rq = 1 [L] ->state > > +  *    MB  RMB > > +  *    [S] ->state = TASK_UNINTERRUPTIBLE [L] ->on_rq > > +  * In the absence of the RMB p->on_rq can be observed to be 0 > > +  * and we end up spinning indefinitely in while (p->on_cpu) > > +  */ So I did replace that comment with the one I proposed earlier. I checked a fair number of architectures and many did not have an obvious barrier in switch_to(). So that is not something we can rely on, nor do we need to I think. > > + smp_rmb(); > >   if (p->on_rq && ttwu_remote(p, wake_flags)) > >   goto stat; > >