From: Juri Lelli <juri.lelli@arm.com>
To: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Claudio Scordino <claudio@evidence.eu.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Tommaso Cucinotta <tommaso.cucinotta@sssup.it>
Subject: Re: [RFC v4 0/6] CPU reclaiming for SCHED_DEADLINE
Date: Wed, 11 Jan 2017 15:06:47 +0000
Message-ID: <20170111150647.GK10415@e106622-lin>
In-Reply-To: <20170111133926.7ec0a5b0@luca>

On 11/01/17 13:39, Luca Abeni wrote:
> Hi Juri,
> (I reply from my new email address)
>
> On Wed, 11 Jan 2017 12:19:51 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
> [...]
> > > > For example, with my taskset, with a hypothetical perfect balance
> > > > of the whole runqueue, one possible scenario is:
> > > >
> > > > CPU      0  1  2  3
> > > > # TASKS  3  3  3  2
> > > >
> > > > In this case, CPUs 0, 1 and 2 are at 100% local utilization.
> > > > Thus, the current tasks on these CPUs will have their runtime
> > > > decreased by GRUB. Meanwhile, the lucky tasks on CPU 3 would
> > > > use additional time that they "globally" do not have, because
> > > > the system, globally, has a load higher than the 66.6...% of the
> > > > local runqueue. In effect, part of the time taken from the tasks
> > > > on CPUs [0-2] is being used by the tasks on CPU 3, until the next
> > > > migration of any task, which will change the lucky tasks... but
> > > > without any guarantee that every task will be a lucky one on every
> > > > activation, causing the problem.
> > > >
> > > > Does it make sense?
> > >
> > > Yes; but my impression is that gEDF will migrate tasks so that the
> > > distribution of the reclaimed CPU bandwidth is almost uniform...
> > > Instead, you saw huge differences in the utilisations (and I do not
> > > think that "compressing" the utilisations from 100% to 95% can
> > > decrease the utilisation of a task from 33% to 25% / 26%... :)
> > >
> >
> > I tried to replicate Daniel's experiment, but I don't see such a
> > skewed allocation. They get a reasonably uniform bandwidth and the
> > trace looks fairly good as well (all processes get to run on the
> > different processors at some time).
>
> With some effort, I replicated the issue noticed by Daniel... I think
> it also depends on the CPU speed (and on good or bad luck :), but the
> "unfair" CPU allocation can actually happen.
Yeah, the actual allocation varies in general. I guess the question is: do
we care? We currently don't consider utilizations when load balancing; only
dynamic deadlines matter.
> I am working on a fix (based on the m-grub modifications proposed at
> last April's SAC - in my original patchset, I over-simplified the
> algorithm).
>
OK, I will have a look at the next version.
>
> > > I suspect there is something more going on here (might be some bug
> > > in one of my patches). I am trying to better understand what
> > > happened.
> >
> > However, playing with this a bit further, I found out one thing that
> > looks counter-intuitive (at least to me :).
> >
> > Simplifying Daniel's example, let's say that we have one 10/30 task
> > running on a CPU with a 500/1000 global limit. Applying grub_reclaim()
> > formula we have:
> >
> > delta_exec = delta * (0.5 + 0.333) = delta * 0.833
> >
> > Which in practice means that 1ms of real delta (at 1000HZ) corresponds
> > to 0.833ms of virtual delta. Considering this, a 10ms (over 30ms)
> > reservation gets "extended" to ~12ms (over 30ms), that is to say the
> > task consumes 0.4 of the CPU's bandwidth. top seems to back what I'm
> > saying, but am I still talking nonsense? :)
>
> You are right; my "Do not reclaim the whole CPU bandwidth" patch is an
> approximation... I hoped that this approximation could be more precise
> than what it really is.
> I used the "Uact + unreclaimable utilization" equation to avoid
> divisions in grub_reclaim(), but the equation should really be "Uact /
> reclaimable utilization"... So, in your example it is
> delta * 0.3333 / 0.5 = delta * 0.6666
> that results in 15ms over 30ms, as expected.
>
> I'll fix that patch for the next submission.
>
Right, OK.
> > I was expecting that the task could consume 0.5 worth of bandwidth
> > with the given global limit. Is the current behaviour intended?
> >
> > If we want to change this behaviour maybe something like the following
> > might work?
> >
> > delta_exec = (delta * to_ratio((1ULL << 20) - rq->dl.non_deadline_bw,
> >                                rq->dl.running_bw)) >> 20
>
> My current patch does
>	(delta * rq->dl.running_bw * rq->dl.deadline_bw_inv) >> 20 >> 8;
> where rq->dl.deadline_bw_inv has been set to
>	to_ratio(global_rt_runtime(), global_rt_period()) >> 12;
>
> This seems to work fine, and should introduce less overhead than
> to_ratio().
>
Sure, we don't want to do divisions if we can avoid them. Why the
intermediate right shifts, though?
Thanks,
- Juri