From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752091AbcFNJJh (ORCPT ); Tue, 14 Jun 2016 05:09:37 -0400 Received: from foss.arm.com ([217.140.101.70]:57248 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751822AbcFNJJf (ORCPT ); Tue, 14 Jun 2016 05:09:35 -0400 Date: Tue, 14 Jun 2016 10:09:34 +0100 From: Juri Lelli To: Peter Zijlstra Cc: mingo@kernel.org, tglx@linutronix.de, rostedt@goodmis.org, xlpang@redhat.com, linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com, jdesfossez@efficios.com, bristot@redhat.com, Ingo Molnar Subject: Re: [RFC][PATCH 1/8] rtmutex: Deboost before waking up the top waiter Message-ID: <20160614090934.GE5981@e106622-lin> References: <20160607195635.710022345@infradead.org> <20160607200215.637804442@infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160607200215.637804442@infradead.org> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, I've got only nitpicks for the changelog. Otherwise the patch looks good to me (and yes, without it bw inheritance would be a problem). On 07/06/16 21:56, Peter Zijlstra wrote: > From: Xunlei Pang > > We should deboost before waking the high-prio task, such that > we don't run two tasks with the same "state"(priority, deadline, ^ space > sched_class, etc) during the period between the end of wake_up_q() > and the end of rt_mutex_adjust_prio(). > > As "Peter Zijlstra" said: > Its semantically icky to have the two tasks running off the same s/Its/It's/ > state and practically icky when you consider bandwidth inheritance -- > where the boosted task wants to explicitly modify the state of the > booster. In that latter case you really want to unboost before you > let the booster run again. > > But this however can lead to prio-inversion if current would get > preempted after the deboost but before waking our high-prio task, > hence we disable preemption before doing deboost, and enabling it s/enabling/re-enable/ > after the wake up is over. > > The patch fixed the logic, and introduced rt_mutex_postunlock() s/The/This/ s/fixed/fixes/ s/introduced/introduces/ > to do some code refactor. > > Most importantly however; this change ensures pointer stability for > the next patch, where we have rt_mutex_setprio() cache a pointer to > the top-most waiter task. If we, as before this change, do the wakeup > first and then deboost, this pointer might point into thin air. > > Cc: Steven Rostedt > Cc: Ingo Molnar > Cc: Juri Lelli > Suggested-by: Peter Zijlstra > [peterz: Changelog] > Signed-off-by: Xunlei Pang > Signed-off-by: Peter Zijlstra (Intel) > Link: http://lkml.kernel.org/r/1461659449-19497-1-git-send-email-xlpang@redhat.com Do we have any specific tests for this set? I'm running mine. Best, - Juri > --- > > kernel/futex.c | 5 ++--- > kernel/locking/rtmutex.c | 28 ++++++++++++++++++++++++---- > kernel/locking/rtmutex_common.h | 1 + > 3 files changed, 27 insertions(+), 7 deletions(-) > > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -1336,9 +1336,8 @@ static int wake_futex_pi(u32 __user *uad > * scheduled away before the wake up can take place. > */ > spin_unlock(&hb->lock); > - wake_up_q(&wake_q); > - if (deboost) > - rt_mutex_adjust_prio(current); > + > + rt_mutex_postunlock(&wake_q, deboost); > > return 0; > } > --- a/kernel/locking/rtmutex.c > +++ b/kernel/locking/rtmutex.c > @@ -1390,12 +1390,32 @@ rt_mutex_fastunlock(struct rt_mutex *loc > } else { > bool deboost = slowfn(lock, &wake_q); > > - wake_up_q(&wake_q); > + rt_mutex_postunlock(&wake_q, deboost); > + } > +} > + > > - /* Undo pi boosting if necessary: */ > - if (deboost) > - rt_mutex_adjust_prio(current); > +/* > + * Undo pi boosting (if necessary) and wake top waiter. > + */ > +void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost) > +{ > + /* > + * We should deboost before waking the top waiter task such that > + * we don't run two tasks with the 'same' priority. This however > + * can lead to prio-inversion if we would get preempted after > + * the deboost but before waking our high-prio task, hence the > + * preempt_disable. > + */ > + if (deboost) { > + preempt_disable(); > + rt_mutex_adjust_prio(current); > } > + > + wake_up_q(wake_q); > + > + if (deboost) > + preempt_enable(); > } > > /** > --- a/kernel/locking/rtmutex_common.h > +++ b/kernel/locking/rtmutex_common.h > @@ -111,6 +111,7 @@ extern int rt_mutex_finish_proxy_lock(st > extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to); > extern bool rt_mutex_futex_unlock(struct rt_mutex *lock, > struct wake_q_head *wqh); > +extern void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost); > extern void rt_mutex_adjust_prio(struct task_struct *task); > > #ifdef CONFIG_DEBUG_RT_MUTEXES > >