From: Waiman Long <longman@redhat.com>
To: Tejun Heo <tj@kernel.org>, Feng Tang <feng.tang@intel.com>
Cc: Zefan Li <lizefan.x@bytedance.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	cgroups@vger.kernel.org, linux-mm@kvack.org,
	Dave Hansen <dave.hansen@intel.com>,
	ying.huang@intel.com
Subject: Re: [RFC PATCH] cgroup/cpuset: fix a memory binding failure for cgroup v2
Date: Sun, 24 Apr 2022 12:04:40 -0400
Message-ID: <e278054c-098a-4859-4cef-2509f77ef0ca@redhat.com>
In-Reply-To: <YmHZK+M470GjeJCV@slm.duckdns.org>

On 4/21/22 18:22, Tejun Heo wrote:
> cc'ing Waiman and copying the whole body.
>
> Waiman, can you please take a look?
>
> Thanks.
>
> On Tue, Apr 19, 2022 at 10:09:58AM +0800, Feng Tang wrote:
>> We got a report that setting cpuset.mems failed when the nodemask
>> contains a newly onlined memory node (not enumerated during boot)
>> for cgroup v2, while the binding succeeded for cgroup v1.
>>
>> The root cause is that, for cgroup v2, when a new memory node is onlined,
>> top_cpuset's 'mems_allowed' is not updated with the new nodemask of
>> memory nodes, so a subsequent attempt to set the memory nodemask fails
>> if the nodemask contains the new node.
>>
>> Fix it by updating top_cpuset.mems_allowed right after the
>> new memory node is onlined, just like v1.
>>
>> Signed-off-by: Feng Tang <feng.tang@intel.com>
>> ---
>> Very likely I missed some details here, but it looks strange that
>> top_cpuset.mems_allowed is not updated even after we onlined
>> several memory nodes after boot.
>>
>>   kernel/cgroup/cpuset.c | 3 +--
>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 9390bfd9f1cd..b97caaf16374 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -3314,8 +3314,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
>>   	/* synchronize mems_allowed to N_MEMORY */
>>   	if (mems_updated) {
>>   		spin_lock_irq(&callback_lock);
>> -		if (!on_dfl)
>> -			top_cpuset.mems_allowed = new_mems;
>> +		top_cpuset.mems_allowed = new_mems;
>>   		top_cpuset.effective_mems = new_mems;
>>   		spin_unlock_irq(&callback_lock);
>>   		update_tasks_nodemask(&top_cpuset);

The on_dfl check was added by commit 7e88291beefb ("cpuset: make
cs->{cpus, mems}_allowed as user-configured masks"). This is the
expected behavior for cgroup v2, as we don't want to drop a node from
the mask just because it is hot-removed. However, I do see a problem
when the node being added is not originally in
top_cpuset.mems_allowed. We should be allowed to add the extra memory
node. So something like:

         if (!on_dfl)
                 top_cpuset.mems_allowed = new_mems;
         else if (!nodes_subset(new_mems, top_cpuset.mems_allowed))
                 nodes_or(top_cpuset.mems_allowed,
                          top_cpuset.mems_allowed, new_mems);
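
To make the intended semantics concrete, here is a minimal user-space
sketch of that update rule. It is only a model, not kernel code: the
type nodemask_sketch_t and the sketch_* helpers are hypothetical
stand-ins (a plain 64-bit mask where bit n represents node n), and
locking under callback_lock is left out entirely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is present. */
typedef uint64_t nodemask_sketch_t;

/* Same meaning as nodes_subset(a, b): true if every node in a is also in b. */
static bool sketch_nodes_subset(nodemask_sketch_t a, nodemask_sketch_t b)
{
	return (a & ~b) == 0;
}

/*
 * Models the suggested handling of top_cpuset.mems_allowed on memory
 * hotplug. on_dfl == true corresponds to cgroup v2 (default hierarchy).
 * Returns the value mems_allowed would hold after the update.
 */
static nodemask_sketch_t sketch_update_mems_allowed(nodemask_sketch_t mems_allowed,
						    nodemask_sketch_t new_mems,
						    bool on_dfl)
{
	/* v1: mems_allowed simply tracks the current set of memory nodes. */
	if (!on_dfl)
		return new_mems;

	/* v2: a node not yet in the mask was onlined, so add it (nodes_or). */
	if (!sketch_nodes_subset(new_mems, mems_allowed))
		return mems_allowed | new_mems;

	/* v2: hot-removal (or no change) leaves the configured mask alone. */
	return mems_allowed;
}
```

With this rule, a v2 mask of nodes 0-1 grows to 0-2 when node 2 is
onlined, while a v2 mask of nodes 0-2 stays unchanged when node 2 is
hot-removed.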

For v2, top_cpuset.mems_allowed is set to node_possible_map in
cpuset_bind(). Perhaps node_possible_map does not include all the
nodes that are hot-pluggable.

I don't know whether there is a similar problem with cpu_possible_mask.

Cheers,
Longman


