From: Peter Zijlstra <peterz@infradead.org>
To: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>,
linux-kernel@vger.kernel.org, kernel-team@android.com,
Zefan Li <lizefan.x@bytedance.com>, Tejun Heo <tj@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
cgroups@vger.kernel.org
Subject: Re: [PATCH 1/2] cpuset: Fix cpuset_cpus_allowed() to not filter offline CPUs
Date: Thu, 2 Feb 2023 20:42:03 +0100
Message-ID: <Y9wSC1Wxlm8CKKlN@hirez.programming.kicks-ass.net>
In-Reply-To: <d630ca53-71f0-c735-fbc3-e826479aa86b@redhat.com>
On Thu, Feb 02, 2023 at 11:06:51AM -0500, Waiman Long wrote:
> After taking a close look at the patch, my understanding of what it is doing
> is as follows:
>
> v2: cpus_allowed will not be affected by hotplug. So the new
> cpuset_cpus_allowed() will return effective_cpus + offline cpus that should
> have been part of effective_cpus if online before masking it with allowable
> cpus and then go up the cpuset hierarchy if necessary.
>
> v1: cpus_allowed is equivalent to v2 effective_cpus. It starts at the
> current cpuset and move up the hierarchy if necessary to find a cpuset that
> have at least one allowable cpu.
>
> First of all, it does not take into account of the v2 partition feature that
> may cause it to produce incorrect result if partition is enabled somewhere.
How so? For a partition the cpus_allowed mask should be the partition
CPUs. The only magical bit about partitions is that any one CPU cannot
belong to two partitions and load-balancing is split.
> Secondly, I don't see any benefit other than having some additional offline
> cpu available in a task's cpumask which the scheduler will ignore anyway.
Those CPUs can come online again -- you're *again* dismissing the true
bug :/
If you filter out the offline CPUs at sched_setaffinity() time, you
forever lose those CPUs, the task will never again move to those CPUs,
even if they do come online after.
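The loss described above can be modelled with plain bitmasks. This is a toy sketch, not kernel code: the function names are hypothetical, and a mask is an integer in which bit N stands for CPU N.

```shell
#!/bin/sh
# Toy model of the two policies; not kernel code. All names are hypothetical.
# A mask is an integer; bit N set means CPU N is in the mask.

# Policy A (the behaviour being objected to): drop offline CPUs at the
# moment the affinity is stored. Prints the mask saved for the task.
#   $1 = requested mask, $2 = CPUs online at that time
filter_at_set_time() {
    echo $(( $1 & $2 ))
}

# Policy B: store the request as given, without looking at online state.
#   $1 = requested mask
store_unfiltered() {
    echo $(( $1 ))
}

# CPUs the task can run on right now: the stored mask limited to the
# CPUs that are currently online.
#   $1 = stored mask, $2 = CPUs online now
runnable_cpus() {
    echo $(( $1 & $2 ))
}
```

With CPUs 0-3 requested (0xf) while only CPU 0 is online (0x1), `filter_at_set_time 0xf 0x1` stores 1, so `runnable_cpus 1 0xf` is still 1 after every CPU has come back. Storing the request unfiltered keeps 15, and `runnable_cpus 15 0xf` gives 15 once the CPUs return.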
It is really simple to reproduce this:
- boot machine
- offline all CPUs except one
- taskset -p ffffffff $$
- online all CPUs
and observe your shell (and all its descendants) being stuck to the one
CPU. Do the same thing on a CPUSET=n build and note the difference (you
retain the full mask).
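The steps above could be scripted roughly as follows. This is a sketch under assumptions: it needs root, it assumes CPU hotplug is exposed through the usual sysfs `online` files, it keeps cpu0 as the one CPU left online, and `CPU_SYSFS`, `set_secondary_cpus` and `repro` are hypothetical names introduced here. The file only defines functions; nothing is offlined until `repro` is called.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Run as root on a disposable machine.
# CPU_SYSFS is a hypothetical override so the helper can be pointed at a
# scratch directory; by default it is the real sysfs CPU directory.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Write $1 (0 = offline, 1 = online) to every CPU's "online" file
# except cpu0, which is kept as the single CPU that stays online.
set_secondary_cpus() {
    for f in "$CPU_SYSFS"/cpu[0-9]*/online; do
        [ -e "$f" ] || continue                    # glob matched nothing
        case $f in */cpu0/online) continue ;; esac # leave cpu0 alone
        echo "$1" > "$f"
    done
}

# The four steps from the list, minus "boot machine".
repro() {
    set_secondary_cpus 0        # offline all CPUs except one
    taskset -p ffffffff $$      # request every CPU while most are offline
    set_secondary_cpus 1        # online all CPUs again
    taskset -p $$               # show the mask this shell ended up with
}
```

Source the file into the shell under test and call `repro` from there, so that `$$` refers to that shell. Per the description above, the last `taskset -p` is where a CPUSET=n build and an affected build should differ.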
> v2 is able to recover a previously offlined cpu. So we don't gain any
> net benefit other than the going up the cpuset hierarchy part.
Only for !root tasks. Not even v2 will re-set the affinity of root tasks
afaict.
> For v1, I agree we should go up the cpuset hierarchy to find a usable
> cpuset. Instead of introducing such a complexity in cpuset_cpus_allowed(),
> my current preference is to do the hierarchy climbing part in an enhanced
> cpuset_cpus_allowed_fallback() after an initial failure of
> cpuset_cpus_allowed(). That will be easier to understand than having such
> complexity and overhead in cpuset_cpus_allowed() alone.
>
> I will work on a patchset to do that as a counter offer.
We will need a small and simple patch for /urgent, or I will need to
revert all your patches -- your call.
I also don't think you fully appreciate the ramifications of
task_cpu_possible_mask(); cpuset currently gets that quite wrong.