Date: Wed, 11 Jan 2017 22:16:46 +0100
From: luca abeni
To: Juri Lelli
Cc: Daniel Bristot de Oliveira, linux-kernel@vger.kernel.org,
 Peter Zijlstra, Ingo Molnar, Claudio Scordino, Steven Rostedt,
 Tommaso Cucinotta
Subject: Re: [RFC v4 0/6] CPU reclaiming for SCHED_DEADLINE
Message-ID: <20170111221646.4da97555@sweethome>
In-Reply-To: <20170111150647.GK10415@e106622-lin>
References: <1483097591-3871-1-git-send-email-lucabe72@gmail.com>
 <6c4ab4ec-7164-fafe-5efc-990f3cf31269@redhat.com>
 <20170104131755.573651ca@sweethome>
 <20170111121951.GI10415@e106622-lin>
 <20170111133926.7ec0a5b0@luca>
 <20170111150647.GK10415@e106622-lin>

On Wed, 11 Jan 2017 15:06:47 +0000
Juri Lelli wrote:

> On 11/01/17 13:39, Luca Abeni wrote:
> > Hi Juri,
> > (I reply from my new email address)
> >
> > On Wed, 11 Jan 2017 12:19:51 +0000
> > Juri Lelli wrote:
> > [...]
> > > > > For example, with my taskset, with a hypothetical perfect
> > > > > balance of the whole runqueue, one possible scenario is:
> > > > >
> > > > > CPU     0 1 2 3
> > > > > # TASKS 3 3 3 2
> > > > >
> > > > > In this case, CPUs 0, 1 and 2 are at 100% local utilization.
> > > > > Thus, the current tasks on these CPUs will have their runtime
> > > > > decreased by GRUB. Meanwhile, the lucky tasks on CPU 3 get to
> > > > > use additional time that they "globally" do not have, because
> > > > > the system as a whole has a load higher than the 66.6...% of
> > > > > that local runqueue. In practice, part of the time taken from
> > > > > the tasks on CPUs [0-2] is being used by the tasks on CPU 3,
> > > > > until the next migration of any task, which changes which
> > > > > tasks are the lucky ones... but without any guarantee that
> > > > > every task will be a lucky one on every activation, which
> > > > > causes the problem.
> > > > >
> > > > > Does it make sense?
> > > >
> > > > Yes; but my impression is that gEDF will migrate tasks so that
> > > > the distribution of the reclaimed CPU bandwidth is almost
> > > > uniform... Instead, you saw huge differences in the
> > > > utilisations (and I do not think that "compressing" the
> > > > utilisations from 100% to 95% can decrease the utilisation of a
> > > > task from 33% to 25% / 26%... :)
> > >
> > > I tried to replicate Daniel's experiment, but I don't see such a
> > > skewed allocation. The tasks get a reasonably uniform bandwidth,
> > > and the trace looks fairly good as well (all processes get to run
> > > on the different processors at some time).
> >
> > With some effort, I replicated the issue noticed by Daniel... I
> > think it also depends on the CPU speed (and on good or bad luck :),
> > but the "unfair" CPU allocation can actually happen.
>
> Yeah, actual allocation in general varies. I guess the question is: do
> we care? We currently don't load balance considering utilizations,
> only dynamic deadlines matter.

Right...
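Just to make the arithmetic behind this explicit, here is a
back-of-the-envelope illustration (plain userspace C, not kernel code;
it assumes the idealized 3/3/3/2 placement above, tasks with nominal
utilization 1/3, ignores the global 95% limit, and only encodes the
rough rule that with purely per-runqueue reclaiming a task receives
about u_i / Uact of its local CPU):

/*
 * Back-of-the-envelope illustration: 11 tasks with nominal
 * utilization 1/3, placed 3/3/3/2 on 4 CPUs, with purely
 * per-runqueue reclaiming.
 */
#include <stdio.h>

int main(void)
{
	const double u = 1.0 / 3.0;           /* nominal task utilization */
	const int ntasks[4] = { 3, 3, 3, 2 }; /* tasks per CPU */
	int cpu;

	for (cpu = 0; cpu < 4; cpu++) {
		double uact = ntasks[cpu] * u; /* local active utilization */
		double received = u / uact;    /* reclaimed share per task */

		printf("CPU %d: Uact = %.2f, each task receives ~%.0f%%\n",
		       cpu, uact, received * 100.0);
	}
	return 0;
}

So, in this idealized picture, the two tasks on CPU 3 can stretch to
about 50% of a CPU while the others stay around their nominal 33%, and
which tasks end up being the "lucky" ones changes at every migration.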
But the problem is that, with the version of GRUB I proposed, this
unfairness can result in some tasks receiving less CPU time than the
guaranteed amount (because some other tasks receive much more). I
think there are at least two possible ways to fix this (without
changing the migration strategy), and I am working on them...
(hopefully, I'll post something next week)

> > > I was expecting that the task could consume 0.5 worth of bandwidth
> > > with the given global limit. Is the current behaviour intended?
> > >
> > > If we want to change this behaviour, maybe something like the
> > > following might work?
> > >
> > > delta_exec = (delta * to_ratio((1ULL << 20) -
> > >              rq->dl.non_deadline_bw, rq->dl.running_bw)) >> 20
> >
> > My current patch does
> >	(delta * rq->dl.running_bw * rq->dl.deadline_bw_inv) >> 20 >> 8;
> > where rq->dl.deadline_bw_inv has been set to
> >	to_ratio(global_rt_runtime(), global_rt_period()) >> 12;
> >
> > This seems to work fine, and should introduce less overhead than
> > to_ratio().
> >
>
> Sure, we don't want to do divisions if we can. Why the intermediate
> right shifts, though?

I wrote it like this as a reminder that the ">> 20" comes from how
to_ratio() computes the utilization, and the additional ">> 8" comes
from the fact that deadline_bw_inv is shifted left by 8 to avoid
losing precision (I used 8 instead of 20 so that the computation can
hopefully be performed on 32 bits... Of course I can revise this if
needed).

If needed, I can change the ">> 20 >> 8" into ">> 28", or remove the
">> 12" from the deadline_bw_inv computation (so that we can use
">> 40", or ">> 20 >> 20", in grub_reclaim()).


			Thanks,
				Luca
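P.S.: for reference, the rule above written out in context (just a
sketch using the field names from this thread; the signature of
grub_reclaim() and the exact code in the patches may differ):

/*
 * Sketch only (field names as used in this thread):
 *
 *   rq->dl.running_bw      - local active utilization Uact, in <<20
 *                            fixed point (as produced by to_ratio())
 *   rq->dl.deadline_bw_inv - 1 / Umax in <<8 fixed point, precomputed
 *                            once as
 *                            to_ratio(global_rt_runtime(),
 *                                     global_rt_period()) >> 12
 *
 * so delta is scaled by Uact / Umax without any division in the hot
 * path; ">> 20" removes the fixed point of running_bw and ">> 8" the
 * fixed point of deadline_bw_inv (equivalently, a single ">> 28").
 */
static u64 grub_reclaim(u64 delta, struct rq *rq)
{
	return (delta * rq->dl.running_bw * rq->dl.deadline_bw_inv)
			>> 20 >> 8;
}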