linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] cgroup/cpuset: Enable memory migration for cpuset v2
@ 2021-08-11 16:30 Waiman Long
  2021-08-11 18:48 ` Johannes Weiner
  0 siblings, 1 reply; 3+ messages in thread
From: Waiman Long @ 2021-08-11 16:30 UTC (permalink / raw)
  To: Tejun Heo, Zefan Li, Johannes Weiner, Jonathan Corbet
  Cc: cgroups, linux-kernel, linux-doc, Waiman Long

When a user changes cpuset.cpus, each task in a v2 cpuset will be moved
to one of the new cpus if it is not there already. For memory, however,
they won't be migrated to the new nodes when cpuset.mems changes. This is
an inconsistency in behavior.

In cpuset v1, there is a memory_migrate control file to enable such
behavior by setting the CS_MEMORY_MIGRATE flag. Make it the default
for cpuset v2 so that we have a consistent set of behavior for both
cpus and memory.

There is certainly a cost to make memory migration the default, but it
is a one time cost that shouldn't really matter as long as cpuset.mems
isn't changed frequenty.  Update the cgroup-v2.rst file to document the
new behavior and recommend against changing cpuset.mems frequently.

Since there won't be any concurrent access to the newly allocated cpuset
structure in cpuset_css_alloc(), we can use the cheaper non-atomic
__set_bit() instead of the more expensive atomic set_bit().

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/cgroup-v2.rst | 6 ++++++
 kernel/cgroup/cpuset.c                  | 6 +++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 5c7377b5bd3e..47c832ad1322 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -2056,6 +2056,12 @@ Cpuset Interface Files
 	The value of "cpuset.mems" stays constant until the next update
 	and won't be affected by any memory nodes hotplug events.
 
+	Setting a non-empty value to "cpuset.mems" causes memory of
+	tasks within the cgroup to be migrated to the designated nodes if
+	they are currently using memory outside of the designated nodes.
+	There is a cost for this migration.  So "cpuset.mems" shouldn't
+	be changed frequently.
+
   cpuset.mems.effective
 	A read-only multiple values file which exists on all
 	cpuset-enabled cgroups.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index adb5190c4429..d151e1de93d4 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2737,12 +2737,16 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 		return ERR_PTR(-ENOMEM);
 	}
 
-	set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	nodes_clear(cs->mems_allowed);
 	nodes_clear(cs->effective_mems);
 	fmeter_init(&cs->fmeter);
 	cs->relax_domain_level = -1;
 
+	/* Set CS_MEMORY_MIGRATE for default hierarchy */
+	if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
+		__set_bit(CS_MEMORY_MIGRATE, &cs->flags);
+
 	return &cs->css;
 }
 
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] cgroup/cpuset: Enable memory migration for cpuset v2
  2021-08-11 16:30 [PATCH] cgroup/cpuset: Enable memory migration for cpuset v2 Waiman Long
@ 2021-08-11 18:48 ` Johannes Weiner
  2021-08-11 19:06   ` Waiman Long
  0 siblings, 1 reply; 3+ messages in thread
From: Johannes Weiner @ 2021-08-11 18:48 UTC (permalink / raw)
  To: Waiman Long
  Cc: Tejun Heo, Zefan Li, Jonathan Corbet, cgroups, linux-kernel, linux-doc

On Wed, Aug 11, 2021 at 12:30:35PM -0400, Waiman Long wrote:
> When a user changes cpuset.cpus, each task in a v2 cpuset will be moved
> to one of the new cpus if it is not there already. For memory, however,
> they won't be migrated to the new nodes when cpuset.mems changes. This is
> an inconsistency in behavior.
> 
> In cpuset v1, there is a memory_migrate control file to enable such
> behavior by setting the CS_MEMORY_MIGRATE flag. Make it the default
> for cpuset v2 so that we have a consistent set of behavior for both
> cpus and memory.
> 
> There is certainly a cost to make memory migration the default, but it
> is a one time cost that shouldn't really matter as long as cpuset.mems
> isn't changed frequenty.  Update the cgroup-v2.rst file to document the
> new behavior and recommend against changing cpuset.mems frequently.
> 
> Since there won't be any concurrent access to the newly allocated cpuset
> structure in cpuset_css_alloc(), we can use the cheaper non-atomic
> __set_bit() instead of the more expensive atomic set_bit().
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  Documentation/admin-guide/cgroup-v2.rst | 6 ++++++
>  kernel/cgroup/cpuset.c                  | 6 +++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> index 5c7377b5bd3e..47c832ad1322 100644
> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -2056,6 +2056,12 @@ Cpuset Interface Files
>  	The value of "cpuset.mems" stays constant until the next update
>  	and won't be affected by any memory nodes hotplug events.
>  
> +	Setting a non-empty value to "cpuset.mems" causes memory of
> +	tasks within the cgroup to be migrated to the designated nodes if
> +	they are currently using memory outside of the designated nodes.
> +	There is a cost for this migration.  So "cpuset.mems" shouldn't
> +	be changed frequently.

The migration skips over pages that are (temporarily) off the LRU for
reclaim, compaction etc. so it can leave random pages behind.

In practice it's probably fine, but it probably makes sense to say
it's advisable to set this config once before the workload launches
for best results, and not rely too much on changing things around
post-hoc, due to cost you pointed out but also due to reliability.

Otherwise no objection from me.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] cgroup/cpuset: Enable memory migration for cpuset v2
  2021-08-11 18:48 ` Johannes Weiner
@ 2021-08-11 19:06   ` Waiman Long
  0 siblings, 0 replies; 3+ messages in thread
From: Waiman Long @ 2021-08-11 19:06 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Tejun Heo, Zefan Li, Jonathan Corbet, cgroups, linux-kernel, linux-doc

On 8/11/21 2:48 PM, Johannes Weiner wrote:
> On Wed, Aug 11, 2021 at 12:30:35PM -0400, Waiman Long wrote:
>> When a user changes cpuset.cpus, each task in a v2 cpuset will be moved
>> to one of the new cpus if it is not there already. For memory, however,
>> they won't be migrated to the new nodes when cpuset.mems changes. This is
>> an inconsistency in behavior.
>>
>> In cpuset v1, there is a memory_migrate control file to enable such
>> behavior by setting the CS_MEMORY_MIGRATE flag. Make it the default
>> for cpuset v2 so that we have a consistent set of behavior for both
>> cpus and memory.
>>
>> There is certainly a cost to make memory migration the default, but it
>> is a one time cost that shouldn't really matter as long as cpuset.mems
>> isn't changed frequenty.  Update the cgroup-v2.rst file to document the
>> new behavior and recommend against changing cpuset.mems frequently.
>>
>> Since there won't be any concurrent access to the newly allocated cpuset
>> structure in cpuset_css_alloc(), we can use the cheaper non-atomic
>> __set_bit() instead of the more expensive atomic set_bit().
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst | 6 ++++++
>>   kernel/cgroup/cpuset.c                  | 6 +++++-
>>   2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index 5c7377b5bd3e..47c832ad1322 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -2056,6 +2056,12 @@ Cpuset Interface Files
>>   	The value of "cpuset.mems" stays constant until the next update
>>   	and won't be affected by any memory nodes hotplug events.
>>   
>> +	Setting a non-empty value to "cpuset.mems" causes memory of
>> +	tasks within the cgroup to be migrated to the designated nodes if
>> +	they are currently using memory outside of the designated nodes.
>> +	There is a cost for this migration.  So "cpuset.mems" shouldn't
>> +	be changed frequently.
> The migration skips over pages that are (temporarily) off the LRU for
> reclaim, compaction etc. so it can leave random pages behind.
>
> In practice it's probably fine, but it probably makes sense to say
> it's advisable to set this config once before the workload launches
> for best results, and not rely too much on changing things around
> post-hoc, due to cost you pointed out but also due to reliability.
>
> Otherwise no objection from me.
>
Thanks for the suggestion. Will update the patch to include additional 
wording about that.

Cheers,
Longman


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-08-11 19:07 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-11 16:30 [PATCH] cgroup/cpuset: Enable memory migration for cpuset v2 Waiman Long
2021-08-11 18:48 ` Johannes Weiner
2021-08-11 19:06   ` Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).