From: Aaron Lu <aaron.lu@intel.com>
To: David Vernet <void@manifault.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
<linux-kernel@vger.kernel.org>, <mingo@redhat.com>,
<juri.lelli@redhat.com>, <vincent.guittot@linaro.org>,
<rostedt@goodmis.org>, <dietmar.eggemann@arm.com>,
<bsegall@google.com>, <mgorman@suse.de>, <bristot@redhat.com>,
<vschneid@redhat.com>, <joshdon@google.com>,
<roman.gushchin@linux.dev>, <tj@kernel.org>,
<kernel-team@meta.com>
Subject: Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS
Date: Fri, 16 Jun 2023 08:53:38 +0800
Message-ID: <20230616005338.GA115001@ziqianlu-dell>
In-Reply-To: <20230615232605.GB2915572@maniforge>
On Thu, Jun 15, 2023 at 06:26:05PM -0500, David Vernet wrote:
> Ok, it seems that the issue is that I wasn't creating enough netperf
> clients. I assumed that -n $(nproc) was sufficient. I was able to repro
Yes, that switch is confusing.
> the contention on my 26 core / 52 thread skylake client as well:
>
>
> Thanks for the help in getting the repro on my end.
You are welcome.
> So yes, there is certainly a scalability concern to bear in mind for
> swqueue for LLCs with a lot of cores. If you have a lot of tasks quickly
> e.g. blocking and waking on futexes in a tight loop, I expect a similar
> issue would be observed.
>
> On the other hand, the issue did not occur on my 7950X. I also wasn't
Using netperf/UDP_RR?
> able to repro the contention on the Skylake if I ran with the default
> netperf workload rather than UDP_RR (even with the additional clients).
I also tried that on the 18cores/36threads/LLC Skylake and the contention
is indeed much smaller than UDP_RR:
7.30% 7.29% [kernel.vmlinux] [k] native_queued_spin_lock_slowpath
But I wouldn't say it's entirely gone. Also consider Skylake has a lot
fewer cores per LLC than later Intel servers like Icelake and Sapphire
Rapids and I expect things would be worse on those two machines.
> I didn't bother to take the mean of all of the throughput results
> between NO_SWQUEUE and SWQUEUE, but they looked roughly equal.
>
> So swqueue isn't ideal for every configuration, but I'll echo my
> sentiment from [0] that this shouldn't on its own necessarily preclude
> it from being merged given that it does help a large class of
> configurations and workloads, and it's disabled by default.
>
> [0]: https://lore.kernel.org/all/20230615000103.GC2883716@maniforge/
I was wondering: does it make sense to split things up on machines with
big LLCs? For example, convert the per-LLC swqueue into a per-group
swqueue, where a group is made of ~8 CPUs of the same LLC. This would
have an effect similar to reducing the number of CPUs in a single LLC,
so the scalability issue can hopefully be fixed while it might still
help some workloads. I realize this isn't ideal, in that wakeup happens
at LLC scale, so the group thing may not fit very well here.
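To make the grouping idea a bit more concrete, here is a minimal
userspace sketch of how the CPUs of one LLC might be mapped to ~8-CPU
groups. It is only an illustration, not code from the patch: the names
swqueue_nr_shards(), swqueue_shard_of() and SWQUEUE_SHARD_CPUS are
hypothetical, and it assumes each CPU can be identified by a dense index
within its LLC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical target group size, taken from the "~8 cpus" figure above. */
#define SWQUEUE_SHARD_CPUS 8

/* Number of groups needed for an LLC with llc_cpus CPUs (rounded up). */
static size_t swqueue_nr_shards(size_t llc_cpus)
{
	assert(llc_cpus > 0);
	return (llc_cpus + SWQUEUE_SHARD_CPUS - 1) / SWQUEUE_SHARD_CPUS;
}

/*
 * Map a CPU's index within its LLC (0 .. llc_cpus - 1) to a group index.
 * Consecutive indices share a group. Whether that lines up with SMT
 * siblings or physical neighbours depends on how CPUs are enumerated,
 * which is not modelled here.
 */
static size_t swqueue_shard_of(size_t cpu_in_llc, size_t llc_cpus)
{
	assert(cpu_in_llc < llc_cpus);
	return cpu_in_llc / SWQUEUE_SHARD_CPUS;
}
```

With this mapping, the 36-thread LLC mentioned above would get five
groups: four with 8 CPUs and a last one with 4. Each group would then
have its own queue and lock, so fewer CPUs contend on any single lock.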
Just a thought, feel free to ignore it if you don't think this is
feasible :-)
Thanks,
Aaron