All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yuyang Du <yuyang.du@intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mike Galbraith <efault@gmx.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	bsegall@google.com, morten.rasmussen@arm.com, pjt@google.com,
	steve.muckle@linaro.org, linux-kernel@vger.kernel.org
Subject: Re: Divide-by-zero in post_init_entity_util_avg
Date: Thu, 9 Jun 2016 09:33:24 +0800	[thread overview]
Message-ID: <20160609013324.GH8105@intel.com> (raw)
In-Reply-To: <20160609090142.GS32344@nuc-i3427.alporthouse.com>

On Thu, Jun 09, 2016 at 10:01:42AM +0100, Chris Wilson wrote:
> I've presumed commit 2b8c41daba327 ("sched/fair: Initiate a new task's
> util avg to a bounded value") to be at fault, hence the CCs. Though it
> may just be a victim.
> 
> gdb says 0x43/0x80 is
> 
>    725			if (cfs_rq->avg.util_avg != 0) {
>    726				sa->util_avg  = cfs_rq->avg.util_avg * se->load.weight;
> -> 727				sa->util_avg /= (cfs_rq->avg.load_avg + 1);
>    728	
>    729				if (sa->util_avg > cap)
>    730					sa->util_avg = cap;
>    731			} else {
> 
> I've run the same fork-heavy workload that seemed to hit the initial
> fault under kasan. kasan has not reported any errors, nor has the bug
> reoccurred after a day (earlier I had a couple of panics within a few
> hours). 
> 
> Is it possible for a race window where cfg_rq->avg.load_avg is indeed
> -1? Any evidence of other memcorruption in the above?

-1 should not be possible, sounds like a soft error.

But, a race is anyway hazardous. Thanks a lot, Chris.

--
Subject: [PATCH] sched/fair: Avoid hazardous reading cfs_rq->avg.load_avg
 without rq lock

The commit 2b8c41daba327 ("sched/fair: Initiate a new task's util avg
to a bounded value") references cfs_rq->avg.load_avg and then the value
is used as a divisor (actually cfs_rq->avg.load_avg + 1).

This race condition may cause a divide-by-zero exception. Fix it by
moving it into rq locked section.

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
---
 kernel/sched/core.c |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 385c947..b9f44df 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2535,10 +2535,9 @@ void wake_up_new_task(struct task_struct *p)
 	 */
 	set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
 #endif
+	rq = __task_rq_lock(p, &rf);
 	/* Post initialize new task's util average when its cfs_rq is set */
 	post_init_entity_util_avg(&p->se);
-
-	rq = __task_rq_lock(p, &rf);
 	activate_task(rq, p, 0);
 	p->on_rq = TASK_ON_RQ_QUEUED;
 	trace_sched_wakeup_new(p);
-- 

  reply	other threads:[~2016-06-09  9:30 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-09  9:01 Divide-by-zero in post_init_entity_util_avg Chris Wilson
2016-06-09  1:33 ` Yuyang Du [this message]
2016-06-09 13:07   ` Peter Zijlstra
2016-06-12 22:25     ` Yuyang Du
2016-06-14 11:25     ` [tip:sched/core] sched/fair: Fix post_init_entity_util_avg() serialization tip-bot for Peter Zijlstra
2016-06-16  8:50     ` Divide-by-zero in post_init_entity_util_avg Peter Zijlstra
2016-06-16 12:25       ` Peter Zijlstra
2016-06-16 16:16         ` Peter Zijlstra
2016-06-17  8:16         ` Andrey Ryabinin
2016-06-17  8:23           ` Peter Zijlstra
2016-06-17  9:19         ` [PATCH] sched/fair: Fix cfs_rq avg tracking underflow Peter Zijlstra
2016-06-17  2:01           ` Yuyang Du
2016-06-20 13:24           ` [tip:sched/urgent] " tip-bot for Peter Zijlstra
2016-06-09 10:29 ` Divide-by-zero in post_init_entity_util_avg Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160609013324.GH8105@intel.com \
    --to=yuyang.du@intel.com \
    --cc=aryabinin@virtuozzo.com \
    --cc=bsegall@google.com \
    --cc=chris@chris-wilson.co.uk \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    --cc=steve.muckle@linaro.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.