From: Valentin Schneider <valentin.schneider@arm.com>
To: Steven Sistare <steven.sistare@oracle.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, subhra.mazumdar@oracle.com,
dhaval.giani@oracle.com, daniel.m.jordan@oracle.com,
pavel.tatashin@microsoft.com, matt@codeblueprint.co.uk,
umgwanakikbuti@gmail.com, riel@redhat.com, jbacik@fb.com,
juri.lelli@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/10] steal tasks to improve CPU utilization
Date: Thu, 25 Oct 2018 12:31:12 +0100
Message-ID: <09b10abc-8357-2db3-3d30-8aa9e95e8655@arm.com>
In-Reply-To: <abf3ae2a-a7f4-2524-0da6-09599928b47a@oracle.com>
On 24/10/2018 20:27, Steven Sistare wrote:
[...]
> Hi Valentin,
>
> Asymmetric systems could maintain a separate bitmap for misfits; set a bit
> when a misfit task goes on CPU, clear it going off. When a fast CPU goes new idle,
> it would first search the misfits mask, then search cfs_overload_cpus.
> The misfits logic would be conditionalized with CONFIG or sched feat static
> branches so symmetric systems do not incur extra overhead.
>
That sounds reasonable - besides, misfit already introduces a
sched_asym_cpucapacity static key. I'll try to play around with that.
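
To make the proposed search order concrete, here is a minimal user-space
sketch, not kernel code. It assumes a single 64-bit word per mask; the names
misfit_cpus, misfit_set(), misfit_clear() and pick_steal_source() are
hypothetical, and a plain bool stands in for the sched_asym_cpucapacity
static key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model only: one 64-bit word stands in for each per-LLC bitmap. */
struct llc_masks {
	uint64_t misfit_cpus;       /* hypothetical: CPUs running a misfit task */
	uint64_t cfs_overload_cpus; /* CPUs with spare runnable CFS tasks */
};

/* Stand-in for the sched_asym_cpucapacity static key (assumption). */
static bool asym_cpucapacity;

/* Set the bit when a misfit task goes on CPU. */
static void misfit_set(struct llc_masks *m, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	m->misfit_cpus |= UINT64_C(1) << cpu;
}

/* Clear the bit when the misfit task goes off CPU. */
static void misfit_clear(struct llc_masks *m, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	m->misfit_cpus &= ~(UINT64_C(1) << cpu);
}

/* Lowest set bit in mask, or -1 if the mask is empty. */
static int first_cpu(uint64_t mask)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * A fast CPU going new idle searches the misfit mask first, then falls
 * back to cfs_overload_cpus. Symmetric systems skip the misfit search.
 */
static int pick_steal_source(const struct llc_masks *m, bool dst_is_fast_cpu)
{
	if (asym_cpucapacity && dst_is_fast_cpu) {
		int cpu = first_cpu(m->misfit_cpus);

		if (cpu >= 0)
			return cpu;
	}
	return first_cpu(m->cfs_overload_cpus);
}
```

With the flag off, the function reduces to a plain cfs_overload_cpus lookup,
which is the "no extra overhead for symmetric systems" property mentioned
above.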
>> We'd also lose the NOHZ update done in idle_balance(), though I think it's
>> not such a big deal - we were piggy-backing this on idle_balance() just
>> because it happened to be convenient, and we still have NOHZ_STATS_KICK
>> anyway.
>
> Agreed.
>
>> Another thing - in your test cases, what is the most prevalent cause of
>> failure to pull a task in idle_balance()? Is it the load_balance() itself
>> that fails to find a task (e.g. because the imbalance is not deemed big
>> enough), or is it the idle migration cost logic that prevents
>> load_balance() from running to completion?
>
> The latter. Eg, for the test "X6-2, 40 CPUs, hackbench 3 process 50000",
> CPU avg_idle is 355566 nsec, and sched_migration_cost_ns = 500000,
> so idle_balance bails at the top:
>     if (this_rq->avg_idle < sysctl_sched_migration_cost ||
>         ...
>         goto out;
>
> For other tests, we get past that clause but bail from a domain:
>     if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
>         ...
>         break;
>
>> In the first case, try_steal() makes perfect sense to me. In the second
>> case, I'm not sure if we really want to pull something if we know (well,
>> we *think*) we're about to resume the execution of some other task.
>
> 355.566 microsec is enough time to steal, go on CPU, do useful work, and go
> off CPU, particularly for chatty workloads like hackbench. The performance
> data bear this out. For the higher loads, the average timeslice for
> hackbench
>
Thanks for the explanation. AIUI the big difference here is that try_steal()
is considerably cheaper than load_balance(), so the rq->avg_idle concerns
matter less (or at least, on a considerably smaller scale).
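
The two bail-out points quoted above can be modelled in a few lines. This is
a user-space sketch under stated assumptions, not the kernel's
idle_balance(): only the avg_idle term of the first check is modelled, the
function name newidle_domains_balanced() is hypothetical, and each domain is
assumed to cost exactly its max_newidle_lb_cost.

```c
#include <stdint.h>

/*
 * Returns -1 if the early check bails before any domain is visited,
 * otherwise the number of domains balanced before the per-domain check
 * breaks out of the walk.
 */
static int newidle_domains_balanced(uint64_t avg_idle_ns,
				    uint64_t migration_cost_ns,
				    const uint64_t *max_newidle_lb_cost_ns,
				    int nr_domains)
{
	uint64_t curr_cost = 0;
	int i;

	/* First clause: expected idle time is below the migration cost. */
	if (avg_idle_ns < migration_cost_ns)
		return -1;

	for (i = 0; i < nr_domains; i++) {
		/* Second clause: this domain would not fit in the idle window. */
		if (avg_idle_ns < curr_cost + max_newidle_lb_cost_ns[i])
			break;

		/* Assumption: balancing costs exactly the recorded maximum. */
		curr_cost += max_newidle_lb_cost_ns[i];
	}
	return i;
}
```

With the figures from the hackbench run (avg_idle of 355566 nsec against a
migration cost of 500000 nsec), the first clause fires and no domain is
visited at all.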
> Perhaps I could skip try_steal() if avg_idle is very small, although with
> hackbench I have seen average time slice as small as 10 microsec under
> high load and preemptions. I'll run some experiments.
>
That might be a safe thing to do. In the same department, maybe we could
skip try_steal() if we bail out of idle_balance() because
!(this_rq->rd->overload). Although rq->rd->overload and cfs_overload_cpus
are decoupled, they should express the same thing here.
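
As a rough illustration of the two gating ideas (a very small avg_idle, and
a root domain that is not overloaded), here is a user-space sketch. The
struct rq_model, the function should_try_steal() and the min_idle_ns
threshold are all hypothetical; no such threshold is defined anywhere in
this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the few runqueue fields the decision depends on. */
struct rq_model {
	uint64_t avg_idle_ns; /* mirrors rq->avg_idle */
	bool rd_overload;     /* mirrors rq->rd->overload */
};

/* Decide whether a steal attempt is worth making (assumed policy). */
static bool should_try_steal(const struct rq_model *rq, uint64_t min_idle_ns)
{
	/* Nothing is overloaded, so there should be nothing to steal. */
	if (!rq->rd_overload)
		return false;

	/* Idle window assumed too short even for a cheap steal. */
	if (rq->avg_idle_ns < min_idle_ns)
		return false;

	return true;
}
```

Where to put min_idle_ns is exactly the open question: with timeslices as
short as 10 microsec reported for hackbench, any useful value would have to
be small.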
>>> We could merge the stealing code into the idle_balance() code to get a
>>> union of the two, but IMO that would be less readable.
>>>
>>> We could remove the core and socket levels from idle_balance()
>>
>> I understand that as only doing load_balance() at DIE level in
>> idle_balance(), as that is what makes most sense to me (with big.LITTLE
>> those misfit migrations are done at DIE level), is that correct?
>
> Correct.
>> Also, with DynamIQ (next gen big.LITTLE) we could have asymmetry at MC
>> level, which could cause issues there.
>
> We could keep idle_balance for this level and fall back to stealing as in
> my patch, or you could extend the misfits bitmap to also include CPUs
> with reduced memory bandwidth and active tasks (if I understand the
> asymmetry correctly).
>
It's mostly µarch asymmetry, so by "asymmetry at MC level" I meant "we'll
see the SD_ASYM_CPUCAPACITY flag at MC level". But if we tweak stealing
to take misfit tasks into account (so we'd rely on SD_ASYM_CPUCAPACITY
in some way or another), that could work.
>>> and let
>>> stealing handle those levels. I think that makes sense after stealing
>>> performance is validated on more architectures, but we would still have
>>> two different mechanisms.
>>>
>>> - Steve
>>
>> I'll try out those patches on top of the misfit series to see how the
>> whole thing behaves.
>
> Very good, thanks.
>
> - Steve
>