From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752458AbaBLOay (ORCPT ); Wed, 12 Feb 2014 09:30:54 -0500 Received: from forward4m.mail.yandex.net ([37.140.138.4]:50619 "EHLO forward4m.mail.yandex.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752093AbaBLOaw (ORCPT ); Wed, 12 Feb 2014 09:30:52 -0500 X-Greylist: delayed 385 seconds by postgrey-1.27 at vger.kernel.org; Wed, 12 Feb 2014 09:30:52 EST From: Kirill Tkhai To: Peter Zijlstra Cc: "mingo@kernel.org" , "hpa@zytor.com" , "linux-kernel@vger.kernel.org" , "tglx@linutronix.de" , "linux-tip-commits@vger.kernel.org" , Steven Rostedt , Juri Lelli , Dan Carpenter In-Reply-To: <20140212140620.GD9987@twins.programming.kicks-ass.net> References: <1328936700.2476.17.camel@laptop> <124891392188453@web4h.yandex.ru> <20140212140620.GD9987@twins.programming.kicks-ass.net> Subject: Re: [tip:sched/core] sched: Push put_prev_task() into pick_next_task( ) MIME-Version: 1.0 Message-Id: <364441392215058@web18m.yandex.ru> X-Mailer: Yamail [ http://yandex.ru ] 5.0 Date: Wed, 12 Feb 2014 18:24:18 +0400 Content-Transfer-Encoding: 8bit Content-Type: text/plain; charset=koi8-r Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 12.02.2014, 18:06, "Peter Zijlstra" : > On Wed, Feb 12, 2014 at 11:00:53AM +0400, Kirill Tkhai wrote: > >>> š@@ -4748,7 +4743,7 @@ static void migrate_tasks(unsigned int dead_cpu) >>> ššššššššššššššššššif (rq->nr_running == 1) >>> ššššššššššššššššššššššššššbreak; >>> >>> š- next = pick_next_task(rq); >>> š+ next = pick_next_task(rq, NULL); >> špick_next_task() firstly checks for prev->sched_class, doesn't it ??? >> >> šThe same for pick_next_task_rt(). > > OK, how about something like this? Good for me. Maybe pack prev->sched_class->put_prev_task() into inline function? Kirill > --- > Subject: sched: Fix hotplug task migration > From: Peter Zijlstra > Date: Wed, 12 Feb 2014 10:49:30 +0100 > > Dan Carpenter reported: > >> škernel/sched/rt.c:1347 pick_next_task_rt() warn: variable dereferenced before check 'prev' (see line 1338) >> škernel/sched/deadline.c:1011 pick_next_task_dl() warn: variable dereferenced before check 'prev' (see line 1005) > > Kirill also spotted that migrate_tasks() will have an instant NULL > deref because pick_next_task() will immediately deref prev. > > Instead of fixing all the corner cases because migrate_tasks() can > pass in a NULL prev task in the unlikely case of hot-un-plug, provide > a fake task such that we can remove all the NULL checks from the far > more common paths. > > A further problem; not previously spotted; is that because we pushed > pre_schedule() and idle_balance() into pick_next_task() we now need to > avoid those getting called and pulling more tasks on our dying CPU. > > We avoid pull_{dl,rt}_task() by setting fake_task.prio to MAX_PRIO+1. > We also note that since we call pick_next_task() exactly the amount of > times we have runnable tasks present, we should never land in > idle_balance(). > > Fixes: 38033c37faab ("sched: Push down pre_schedule() and idle_balance()") > Cc: Steven Rostedt > Cc: Juri Lelli > Cc: Ingo Molnar > Reported-by: Kirill Tkhai > Reported-by: Dan Carpenter > Signed-off-by: Peter Zijlstra > Link: http://lkml.kernel.org/r/20140212094930.GB3545@laptop.programming.kicks-ass.net > --- > škernel/sched/core.c ššššš| šš18 +++++++++++++++++- > škernel/sched/deadline.c š| ššš3 +-- > škernel/sched/fair.c ššššš| ššš5 ++--- > škernel/sched/idle_task.c | ššš3 +-- > škernel/sched/rt.c ššššššš| ššš3 +-- > škernel/sched/stop_task.c | ššš3 +-- > š6 files changed, 23 insertions(+), 12 deletions(-) > > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -4681,6 +4681,22 @@ static void calc_load_migrate(struct rq > šššššššššššššššššatomic_long_add(delta, &calc_load_tasks); > š} > > +static void put_prev_task_fake(struct rq *rq, struct task_struct *prev) > +{ > +} > + > +static const struct sched_class fake_sched_class = { > + .put_prev_task = put_prev_task_fake, > +}; > + > +static struct task_struct fake_task = { > + /* > + * Avoid pull_{rt,dl}_task() > + */ > + .prio = MAX_PRIO + 1, > + .sched_class = &fake_sched_class, > +}; > + > š/* > šš* Migrate all tasks from the rq, sleeping tasks will be migrated by > šš* try_to_wake_up()->select_task_rq(). > @@ -4721,7 +4737,7 @@ static void migrate_tasks(unsigned int d > šššššššššššššššššif (rq->nr_running == 1) > šššššššššššššššššššššššššbreak; > > - next = pick_next_task(rq, NULL); > + next = pick_next_task(rq, &fake_task); > šššššššššššššššššBUG_ON(!next); > šššššššššššššššššnext->sched_class->put_prev_task(rq, next); > > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -1008,8 +1008,7 @@ struct task_struct *pick_next_task_dl(st > šššššššššif (unlikely(!dl_rq->dl_nr_running)) > šššššššššššššššššreturn NULL; > > - if (prev) > - prev->sched_class->put_prev_task(rq, prev); > + prev->sched_class->put_prev_task(rq, prev); > > šššššššššdl_se = pick_next_dl_entity(rq, dl_rq); > šššššššššBUG_ON(!dl_se); > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -4690,7 +4690,7 @@ pick_next_task_fair(struct rq *rq, struc > šššššššššif (!cfs_rq->nr_running) > šššššššššššššššššgoto idle; > > - if (!prev || prev->sched_class != &fair_sched_class) > + if (prev->sched_class != &fair_sched_class) > šššššššššššššššššgoto simple; > > ššššššššš/* > @@ -4766,8 +4766,7 @@ pick_next_task_fair(struct rq *rq, struc > šššššššššif (!cfs_rq->nr_running) > šššššššššššššššššgoto idle; > > - if (prev) > - prev->sched_class->put_prev_task(rq, prev); > + prev->sched_class->put_prev_task(rq, prev); > > šššššššššdo { > šššššššššššššššššse = pick_next_entity(cfs_rq, NULL); > --- a/kernel/sched/idle_task.c > +++ b/kernel/sched/idle_task.c > @@ -26,8 +26,7 @@ static void check_preempt_curr_idle(stru > šstatic struct task_struct * > špick_next_task_idle(struct rq *rq, struct task_struct *prev) > š{ > - if (prev) > - prev->sched_class->put_prev_task(rq, prev); > + prev->sched_class->put_prev_task(rq, prev); > > šššššššššschedstat_inc(rq, sched_goidle); > š#ifdef CONFIG_SMP > --- a/kernel/sched/rt.c > +++ b/kernel/sched/rt.c > @@ -1344,8 +1344,7 @@ pick_next_task_rt(struct rq *rq, struct > šššššššššif (rt_rq_throttled(rt_rq)) > šššššššššššššššššreturn NULL; > > - if (prev) > - prev->sched_class->put_prev_task(rq, prev); > + prev->sched_class->put_prev_task(rq, prev); > > šššššššššp = _pick_next_task_rt(rq); > > --- a/kernel/sched/stop_task.c > +++ b/kernel/sched/stop_task.c > @@ -31,8 +31,7 @@ pick_next_task_stop(struct rq *rq, struc > šššššššššif (!stop || !stop->on_rq) > šššššššššššššššššreturn NULL; > > - if (prev) > - prev->sched_class->put_prev_task(rq, prev); > + prev->sched_class->put_prev_task(rq, prev); > > šššššššššstop->se.exec_start = rq_clock_task(rq);