linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov@parallels.com>
To: Peter Zijlstra <peterz@infradead.org>, Paul Turner <pjt@google.com>
Cc: Ingo Molnar <mingo@redhat.com>, <devel@openvz.org>,
	<linux-kernel@vger.kernel.org>
Subject: [PATCH] sched: initialize runtime to non-zero on cfs bw set
Date: Fri, 8 Feb 2013 11:10:46 +0400	[thread overview]
Message-ID: <1360307446-26978-1-git-send-email-vdavydov@parallels.com> (raw)

If cfs_rq->runtime_remaining is <= 0 then either
- cfs_rq is throttled and waiting for quota redistribution, or
- cfs_rq is currently executing and will be throttled on
  put_prev_entity, or
- cfs_rq is not throttled and has not executed since its quota was set
  (runtime_remaining is set to 0 on cfs bandwidth reconfiguration).

It is obvious that the last case is rather an exception from the rule
"runtime_remaining<=0 iff cfs_rq is throttled or will be throttled as
soon as it finishes its execution". Moreover, it can lead to a task hang
as follows. If put_prev_task is called immediately after first
pick_next_task after quota was set, "immediately" meaning rq->clock in
both functions is the same, then the corresponding cfs_rq will be
throttled. Besides being unfair (the cfs_rq has not executed in fact),
the quota refilling timer can be idle at that time and it won't be
activated on put_prev_task because update_curr calls
account_cfs_rq_runtime, which activates the timer, only if delta_exec is
strictly positive. As a result we can get a task "running" inside a
throttled cfs_rq which will probably never be unthrottled.

To avoid the problem, the patch makes tg_set_cfs_bandwidth initialize
runtime_remaining of each cfs_rq to 1 instead of 0 so that the cfs_rq
will be throttled only if it has executed for some positive number of
nanoseconds.
--
Several times we had our customers encountered such hangs inside a VM
(seems something is wrong or rather different in time accounting there).
Analyzing crash dumps revealed that hung tasks were running inside
cfs_rq's, which had the following setup

cfs_rq->throttled=1
cfs_rq->runtime_enabled=1
cfs_rq->runtime_remaining=0
cfs_rq->tg->cfs_bandwidth.idle=1
cfs_rq->tg->cfs_bandwidth.timer_active=0

which conforms pretty nice to the explanation given above.

Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
---
 kernel/sched/core.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..c7a078f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7686,7 +7686,7 @@ static int tg_set_cfs_bandwidth(struct task_group *tg, u64 period, u64 quota)
 
 		raw_spin_lock_irq(&rq->lock);
 		cfs_rq->runtime_enabled = runtime_enabled;
-		cfs_rq->runtime_remaining = 0;
+		cfs_rq->runtime_remaining = 1;
 
 		if (cfs_rq->throttled)
 			unthrottle_cfs_rq(cfs_rq);
-- 
1.7.1


             reply	other threads:[~2013-02-08  7:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-08  7:10 Vladimir Davydov [this message]
2013-02-08 14:46 ` [PATCH] sched: initialize runtime to non-zero on cfs bw set Paul Turner
2013-02-08 15:26   ` Vladimir Davydov
     [not found]   ` <BEF8F492-C44F-43F1-AB39-EA498A0063EA@parallels.com>
2013-02-08 16:32     ` Vladimir Davydov
2013-02-08 15:17 ` [tip:sched/urgent] sched: Initialize cfs_rq-> runtime_remaining " tip-bot for Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1360307446-26978-1-git-send-email-vdavydov@parallels.com \
    --to=vdavydov@parallels.com \
    --cc=devel@openvz.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pjt@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).