From: Subhra Mazumdar <subhra.mazumdar@oracle.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	daniel.lezcano@linaro.org, steven.sistare@oracle.com,
	dhaval.giani@oracle.com, rohit.k.jain@oracle.com
Subject: Re: [PATCH 1/3] sched: remove select_idle_core() for scalability
Date: Tue, 24 Apr 2018 14:45:50 -0700
Message-ID: <df8f4982-908a-23e6-5148-4d4e91d38be4@oracle.com>
In-Reply-To: <20180424124621.GQ4082@hirez.programming.kicks-ass.net>



On 04/24/2018 05:46 AM, Peter Zijlstra wrote:
> On Mon, Apr 23, 2018 at 05:41:14PM -0700, subhra mazumdar wrote:
>> select_idle_core() can potentially search all cpus to find the fully idle
>> core even if there is one such core. Removing this is necessary to achieve
>> scalability in the fast path.
> So this removes the whole core awareness from the wakeup path; this
> needs far more justification.
>
> In general running on pure cores is much faster than running on threads.
> If you plot performance numbers there's almost always a fairly
> significant drop in slope at the moment when we run out of cores and
> start using threads.
The only justification I have is that almost all of the benchmarks I ran
improved, most importantly our internal Oracle DB tests, which we care about
a lot. So what you said makes sense in theory but is not borne out by
real-world results. This indicates that the threads of these benchmarks care
more about running immediately on any idle cpu than about spending time
finding a fully idle core to run on.
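
To make the tradeoff concrete, here is a minimal userspace sketch (not the
kernel code) that counts how many cpus each strategy examines. The machine
size, the sibling layout (cpu c and cpu c + NR_CORES share a core) and both
function names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CORES 4                /* hypothetical SMT2 machine */
#define NR_CPUS  (2 * NR_CORES)   /* assumed: cpu c and c + NR_CORES are siblings */

/* Return the first idle cpu, or -1; *examined counts the cpus looked at. */
static int first_idle_cpu(const bool idle[NR_CPUS], int *examined)
{
    assert(examined);
    *examined = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        (*examined)++;
        if (idle[cpu])
            return cpu;
    }
    return -1;
}

/* Return a cpu on a core whose siblings are all idle, or -1. */
static int first_idle_core(const bool idle[NR_CPUS], int *examined)
{
    assert(examined);
    *examined = 0;
    for (int core = 0; core < NR_CORES; core++) {
        bool all_idle = true;

        /* Every sibling of the core is checked before deciding. */
        for (int cpu = core; cpu < NR_CPUS; cpu += NR_CORES) {
            (*examined)++;
            if (!idle[cpu])
                all_idle = false;
        }
        if (all_idle)
            return core;
    }
    return -1;
}
```

In this toy model, when idle cpus exist but no core is fully idle, the core
search looks at every cpu before giving up, while the plain search stops at
the first idle cpu it meets.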
> Also, depending on cpu enumeration, your next patch might not even leave
> the core scanning for idle CPUs.
>
> Now, typically on Intel systems, we first enumerate cores and then
> siblings, but I've seen Intel systems that don't do this and enumerate
> all threads together. Also other architectures are known to iterate full
> cores together, both s390 and Power for example do this.
>
> So by only doing a linear scan on CPU number you will actually fill
> cores instead of equally spreading across cores. Worse still, by
> limiting the scan to _4_ you only barely even get onto a next core for
> SMT4 hardware, never mind SMT8.
Again, this doesn't matter for the benchmarks I ran. Most are happy to make
the tradeoff on x86 (SMT2). Limiting the scan is mitigated by the fact that
the scan window is rotated over all cpus, so idle cpus will be found soon.
There is also stealing by idle cpus. Also, this was an RFT, so I request that
it be tested on other architectures with SMT4/SMT8.
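
As a rough illustration of the rotating window, the sketch below scans at
most SCAN_LIMIT cpus per wakeup and remembers where to resume. It is a
userspace model written under assumptions, not the code from patch 3/3: the
function name, the machine size and the way next_cpu is advanced are all
hypothetical. The two core_*() helpers model the two enumeration orders
discussed above for an SMT2 machine.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS    8   /* hypothetical machine size */
#define SCAN_LIMIT 4   /* scan limit mentioned in this thread */

/*
 * Examine at most SCAN_LIMIT cpus starting at *next_cpu, wrapping around.
 * Returns the first idle cpu found, or -1. *next_cpu is advanced past the
 * last cpu examined, so the following call covers a different window.
 */
static int scan_rotating_window(const bool idle[NR_CPUS], int *next_cpu)
{
    assert(*next_cpu >= 0 && *next_cpu < NR_CPUS);

    int cpu = *next_cpu;
    for (int i = 0; i < SCAN_LIMIT; i++) {
        int cur = cpu;

        cpu = (cpu + 1) % NR_CPUS;
        if (idle[cur]) {
            *next_cpu = cpu;
            return cur;
        }
    }
    *next_cpu = cpu;
    return -1;
}

/* Assumed enumeration A: cores first, then siblings (cpu c and c + 4). */
static int core_cores_first(int cpu)
{
    return cpu % (NR_CPUS / 2);
}

/* Assumed enumeration B: threads of a core together (cpu 2c and 2c + 1). */
static int core_threads_together(int cpu)
{
    return cpu / 2;
}
```

With enumeration A, a window of four consecutive cpu numbers touches four
different cores; with enumeration B, it touches only two, which is the
"filling cores" concern. The rotation means a missed idle cpu is reached on
a later call rather than never.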
>
> So while I'm not averse to limiting the empty core search; I do feel it
> is important to have. Overloading cores when you don't have to is not
> good.
Can we have a config or a way for enabling/disabling select_idle_core?

