From: Joel Fernandes <joel@joelfernandes.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>,
mingo@kernel.org, linux-kernel@vger.kernel.org,
juri.lelli@redhat.com, dietmar.eggemann@arm.com,
rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
bristot@redhat.com, corbet@lwn.net, qyousef@layalina.io,
chris.hyser@oracle.com, patrick.bellasi@matbug.net,
pjt@google.com, pavel@ucw.cz, qperret@google.com,
tim.c.chen@linux.intel.com, joshdon@google.com, timj@gnu.org,
kprateek.nayak@amd.com, yu.c.chen@intel.com,
youssefesmat@chromium.org, efault@gmx.de
Subject: Re: [PATCH 14/17] sched/eevdf: Better handle mixed slice length
Date: Tue, 4 Apr 2023 13:50:50 +0000
Message-ID: <20230404135050.GA471948@google.com>
In-Reply-To: <20230404092936.GD284733@hirez.programming.kicks-ass.net>

On Tue, Apr 04, 2023 at 11:29:36AM +0200, Peter Zijlstra wrote:
> On Fri, Mar 31, 2023 at 05:26:51PM +0200, Vincent Guittot wrote:
> > On Tue, 28 Mar 2023 at 13:06, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > In the case where (due to latency-nice) there are different request
> > > sizes in the tree, the smaller requests tend to be dominated by the
> > > larger. Also note how the EEVDF lag limits are based on r_max.
> > >
> > > Therefore, add a heuristic that, for the mixed request size case, moves
> > > smaller requests to placement strategy #2, which ensures they're
> > > immediately eligible and, due to their smaller (virtual) deadline,
> > > will cause preemption.
> > >
> > > NOTE: this relies on update_entity_lag() to impose lag limits above
> > > a single slice.
> > >
> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > > ---
> > > kernel/sched/fair.c | 14 ++++++++++++++
> > > kernel/sched/features.h | 1 +
> > > kernel/sched/sched.h | 1 +
> > > 3 files changed, 16 insertions(+)
> > >
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -616,6 +616,7 @@ avg_vruntime_add(struct cfs_rq *cfs_rq,
> > > s64 key = entity_key(cfs_rq, se);
> > >
> > > cfs_rq->avg_vruntime += key * weight;
> > > + cfs_rq->avg_slice += se->slice * weight;
> > > cfs_rq->avg_load += weight;
> > > }
> > >
> > > @@ -626,6 +627,7 @@ avg_vruntime_sub(struct cfs_rq *cfs_rq,
> > > s64 key = entity_key(cfs_rq, se);
> > >
> > > cfs_rq->avg_vruntime -= key * weight;
> > > + cfs_rq->avg_slice -= se->slice * weight;
> > > cfs_rq->avg_load -= weight;
> > > }
> > >
> > > @@ -4832,6 +4834,18 @@ place_entity(struct cfs_rq *cfs_rq, stru
> > > lag = se->vlag;
> > >
> > > /*
> > > + * For latency sensitive tasks; those that have a shorter than
> > > + * average slice and do not fully consume the slice, transition
> > > + * to EEVDF placement strategy #2.
> > > + */
> > > + if (sched_feat(PLACE_FUDGE) &&
> > > + cfs_rq->avg_slice > se->slice * cfs_rq->avg_load) {
> > > + lag += vslice;
> > > + if (lag > 0)
> > > + lag = 0;
> >
> > By using different lag policies for tasks, doesn't this create
> > unfairness between tasks ?
>
> Possibly, I've just not managed to trigger it yet -- if it is an issue
> it can be fixed by ensuring we don't place the entity before its
> previous vruntime just like the sleeper hack later on.
>
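
For readers following the quoted hunk: the PLACE_FUDGE condition compares a load-weighted sum of slices against se->slice * avg_load, which amounts to asking whether the entity's slice is shorter than the weighted mean slice without doing a division. Below is a minimal user-space sketch of that bookkeeping and of the lag clamp. The toy_* names and struct layouts are hypothetical stand-ins, not the kernel's types, and the unsigned 64-bit fields are an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two cfs_rq sums used by the check. */
struct toy_rq {
	uint64_t avg_slice;	/* sum of slice_i * weight_i */
	uint64_t avg_load;	/* sum of weight_i */
};

/* Hypothetical stand-in for a scheduling entity. */
struct toy_se {
	uint64_t slice;		/* requested slice, e.g. in ns */
	uint64_t weight;	/* load weight */
};

/* Mirrors the additions made in avg_vruntime_add() in the quoted patch. */
static void toy_add(struct toy_rq *rq, const struct toy_se *se)
{
	rq->avg_slice += se->slice * se->weight;
	rq->avg_load += se->weight;
}

/* Mirrors the subtractions made in avg_vruntime_sub() in the quoted patch. */
static void toy_sub(struct toy_rq *rq, const struct toy_se *se)
{
	rq->avg_slice -= se->slice * se->weight;
	rq->avg_load -= se->weight;
}

/*
 * True when se->slice is strictly below the weighted mean slice:
 * avg_slice / avg_load > slice, rewritten as avg_slice > slice * avg_load.
 */
static bool toy_shorter_than_avg(const struct toy_rq *rq,
				 const struct toy_se *se)
{
	return rq->avg_slice > se->slice * rq->avg_load;
}

/* The clamp from the hunk: add one vslice of lag, but never go positive. */
static int64_t toy_fudge_lag(int64_t lag, int64_t vslice)
{
	lag += vslice;
	return lag > 0 ? 0 : lag;
}
```

With two equal-weight entities whose slices are 3ms and 1ms, toy_shorter_than_avg() is true only for the 1ms one. The comparison is strict, so an entity sitting exactly at the mean is left alone.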
> > I wanted to stress this situation with a simple use case but it seems
> > that even without changing the slice, there is a fairness problem:
> >
> > Task A always runs
> > Task B loops on: running 1ms then sleeping 1ms
> > default nice and latency nice prio for both
> > each task should get around 50% of the time.
> >
> > The fairness is ok with tip/sched/core
> > but with eevdf, Task B only gets around 30%
> >
> > I haven't identified the problem so far
>
> Heh, this is actually the correct behaviour. If you have a u=1 and a
> u=.5 task, you should distribute time on a 2:1 basis, eg. 67% vs 33%.

Splitting like that sounds like starvation of the sleeper to me. If something
sleeps a lot, it will get even less CPU time on average than it would if
there were no contention from the u=1 task.
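
To make the numbers in this exchange concrete, here is a small sketch of the two ways of splitting the CPU being argued about. toy_prop_share() is the split proportional to utilization described in the quoted reply (u=1 vs u=.5 gives 2:1). toy_capped_share() is one reading of the expectation in Vincent's test: an equal entitlement of 1/n, capped by what the task actually asks for. Both function names are hypothetical, and the second deliberately ignores redistribution of time that a task leaves unused.

```c
/*
 * Share of the CPU a task gets when time is divided in proportion to
 * utilization: u / (sum of all u). For u=1 and u=.5 this is 2/3 and 1/3.
 */
static double toy_prop_share(double u, double u_total)
{
	return u / u_total;
}

/*
 * Share under equal entitlement among n equal-weight tasks, capped by
 * the task's own demand u. Assumption: unused time is not handed back
 * to the other tasks, which a real scheduler would do.
 */
static double toy_capped_share(double u, int n)
{
	double entitlement = 1.0 / n;

	return u < entitlement ? u : entitlement;
}
```

For the test above, the proportional split gives Task B about 33%, while the capped equal entitlement gives it 50%, which is also what it would consume running alone. That gap is the disagreement.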

Also, cgroups will be even weirder than they already are in such a world:
two different containers will not get CPU time distributed properly, say if
tasks in one container sleep a lot and tasks in the other container are CPU
bound.

thanks,

 - Joel