From: Benjamin Segall <bsegall@google.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>,
peterz@infradead.org, bristot@redhat.com,
dietmar.eggemann@arm.com, joshdon@google.com,
juri.lelli@redhat.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
linux@rasmusvillemoes.dk, mgorman@suse.de, mingo@kernel.org,
rostedt@goodmis.org, valentin.schneider@arm.com,
vincent.guittot@linaro.org
Subject: Re: [PATCH 1/1] sched/fair: improve yield_to vs fairness
Date: Tue, 27 Jul 2021 11:57:13 -0700
Message-ID: <xm2635rza8l2.fsf@google.com>
In-Reply-To: <1acd7520-bd4b-d43d-302a-8dcacf6defa5@de.ibm.com> (Christian Borntraeger's message of "Mon, 26 Jul 2021 20:41:15 +0200")
Christian Borntraeger <borntraeger@de.ibm.com> writes:
> On 23.07.21 18:21, Mel Gorman wrote:
>> On Fri, Jul 23, 2021 at 02:36:21PM +0200, Christian Borntraeger wrote:
>>>> sched: Do not select highest priority task to run if it should be skipped
>>>>
>>>> <SNIP>
>>>>
>>>> index 44c452072a1b..ddc0212d520f 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -4522,7 +4522,8 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>>>> se = second;
>>>> }
>>>> - if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>>>> + if (cfs_rq->next &&
>>>> + (cfs_rq->skip == left || wakeup_preempt_entity(cfs_rq->next, left) < 1)) {
>>>> /*
>>>> * Someone really wants this to run. If it's not unfair, run it.
>>>> */
>>>>
>>>
>>> I do see a reduction in ignored yields, but from a performance aspect for my
>>> testcases this patch does not provide a benefit, while the simple
>>> 	curr->vruntime += sysctl_sched_min_granularity;
>>> does.
>> I'm still not a fan because vruntime gets distorted. From the docs:
>>
>>    Small detail: on "ideal" hardware, at any time all tasks would have the
>>    same p->se.vruntime value --- i.e., tasks would execute simultaneously
>>    and no task would ever get "out of balance" from the "ideal" share of
>>    CPU time
>>
>> If yield_to impacts this "ideal share" then it could have other
>> consequences.
>> I think your patch may be performing better in your test case because every
>> "wrong" task selected that is not the yield_to target gets penalised and
>> so the yield_to target gets pushed up the list.
>>
>>> I still think that your approach is probably the cleaner one, any chance to improve this
>>> somehow?
>>>
>> Potentially. The patch was a bit off because while it noticed that skip
>> was not being obeyed, the fix was clumsy and isolated. The current flow is
>> 1. pick se == left as the candidate
>> 2. try pick a different se if the "ideal" candidate is a skip candidate
>> 3. Ignore the se update if next or last are set
>> Step 3 looks off because it ignores skip if next or last buddies are set
>> and I don't think that was intended. Can you try this?
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 44c452072a1b..d56f7772a607 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4522,12 +4522,12 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
>> se = second;
>> }
>> - if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, left) < 1) {
>> + if (cfs_rq->next && wakeup_preempt_entity(cfs_rq->next, se) < 1) {
>> /*
>> * Someone really wants this to run. If it's not unfair, run it.
>> */
>> se = cfs_rq->next;
>> - } else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, left) < 1) {
>> + } else if (cfs_rq->last && wakeup_preempt_entity(cfs_rq->last, se) < 1) {
>> /*
>> * Prefer last buddy, try to return the CPU to a preempted task.
>> */
>>
>
> This one alone does not seem to make a difference. Neither in ignored yield, nor
> in performance.
>
> Your first patch does really help in terms of ignored yields when
> all threads are pinned to one host CPU. After that we do have no ignored yield
> it seems. But it does not affect the performance of my testcase.
> I did some more experiments and I removed the wakeup_preempt_entity checks in
> pick_next_entity - assuming that this will result in source always being stopped
> and target always being picked. But still, no performance difference.
> As soon as I play with vruntime I do see a difference (but only without the cpu cgroup
> controller). I will try to better understand the scheduler logic and do some more
> testing. If you have anything that I should test, let me know.
>
> Christian

If both yielder and target are in the same cpu cgroup or the cpu cgroup
controller is disabled (i.e., if cfs_rq_of(&p->se) matches the yielder's
cfs_rq), you could try

	if (p->se.vruntime > rq->curr->se.vruntime)
		swap(p->se.vruntime, rq->curr->se.vruntime);

in addition to the existing buddy flags, as an entirely fair vruntime boost
to the target.

For when they aren't direct siblings, you /could/ use find_matching_se,
but it's much less clear that's desirable, since it would yield vruntime
for the entire hierarchy to the target's hierarchy.
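
To make the sibling case concrete, here is a minimal userspace sketch of the
swap, not kernel code. The names toy_cfs_rq, toy_se and toy_yield_to_boost
are hypothetical stand-ins for illustration only. One deliberate difference
from the two-line snippet above: the comparison uses a signed delta, the
same wrap-safe idiom the kernel uses when ordering entities by vruntime,
where the snippet uses a plain ">".

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the kernel structures (hypothetical, illustration only). */
struct toy_cfs_rq {
	int id;
};

struct toy_se {
	uint64_t vruntime;
	struct toy_cfs_rq *cfs_rq;	/* the queue this entity is on */
};

/*
 * Swap vruntime between the yielder (curr) and the yield_to target, but
 * only when both sit on the same cfs_rq and the target is currently
 * "behind" (has the larger vruntime). The sum of the two vruntimes is
 * unchanged, so no other task on the queue is penalised.
 *
 * Returns true if a swap happened.
 */
static bool toy_yield_to_boost(struct toy_se *curr, struct toy_se *target)
{
	uint64_t tmp;

	/* Not direct siblings: leave the hierarchy case alone. */
	if (curr->cfs_rq != target->cfs_rq)
		return false;

	/* Signed delta so the comparison survives u64 wraparound. */
	if ((int64_t)(target->vruntime - curr->vruntime) <= 0)
		return false;

	tmp = target->vruntime;
	target->vruntime = curr->vruntime;
	curr->vruntime = tmp;
	return true;
}
```

With curr at vruntime 100 and the target at 250 on the same queue, the call
leaves the target at 100 and curr at 250; a second call is a no-op because
the target is no longer behind.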