* [problem] Hung task caused by memory migration when cpuset.mems changes
@ 2024-03-25 14:46 Chuyi Zhou
  2024-03-26 17:26 ` Tejun Heo
  0 siblings, 1 reply; 8+ messages in thread
From: Chuyi Zhou @ 2024-03-25 14:46 UTC
  To: cgroups, longman, tj, hughd
  Cc: wuyun.abel, hezhongkun.hzk, chenying.kernel, zhanghaoyu.zhy, Chuyi Zhou

In our production environment, we have observed several cases of hung tasks
blocked on cgroup_mutex. The underlying cause is that when a user modifies
cpuset.mems, memory migration is performed in a workqueue. The duration of
this migration depends on the memory size of the workload and can be
significant.

In __cgroup_procs_write, there is a flush_workqueue call that waits for the
migration to complete while holding cgroup_mutex. As a result, most
cgroup-related operations can be blocked until the migration finishes.
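
The blocking pattern described above can be reproduced in a small userspace
analogy. This is only a sketch using pthreads, not kernel code: big_lock
stands in for cgroup_mutex, migration_worker for the queued migration work,
and pthread_join for the flush. All names are hypothetical and chosen for
illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Stands in for cgroup_mutex (hypothetical userspace analogy). */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_held;       /* posted once the writer holds big_lock */
static sem_t migration_done;  /* posted when the "migration" may finish */

/* Stands in for the queued migration work; it runs until released. */
static void *migration_worker(void *arg)
{
    (void)arg;
    sem_wait(&migration_done);
    return NULL;
}

/* Stands in for the write path: waits for the worker with the lock held. */
static void *procs_writer(void *arg)
{
    pthread_t *worker = arg;

    pthread_mutex_lock(&big_lock);
    sem_post(&lock_held);
    pthread_join(*worker, NULL);      /* analogous to flushing the workqueue */
    pthread_mutex_unlock(&big_lock);
    return NULL;
}

/*
 * Returns what an unrelated task sees when it tries to take the lock
 * while the migration is still in progress.
 */
int probe_lock_during_migration(void)
{
    pthread_t worker, writer;
    int rc;

    sem_init(&lock_held, 0, 0);
    sem_init(&migration_done, 0, 0);
    pthread_create(&worker, NULL, migration_worker, NULL);
    pthread_create(&writer, NULL, procs_writer, &worker);

    sem_wait(&lock_held);                   /* writer now holds big_lock */
    rc = pthread_mutex_trylock(&big_lock);  /* the lock is busy here */
    if (rc == 0)
        pthread_mutex_unlock(&big_lock);

    sem_post(&migration_done);              /* let the "migration" end */
    pthread_join(writer, NULL);
    sem_destroy(&lock_held);
    sem_destroy(&migration_done);
    return rc;
}
```

In this sketch the probe cannot take the lock for as long as the worker
runs, however long that is, which mirrors why unrelated cgroup operations
stall behind a slow migration.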

We have noticed the commit "cgroup/cpuset: Enable memory migration for
cpuset v2"[1]. This commit enforces memory migration when modifying the
cpuset. Furthermore, in cgroup v2, there is no option available for
users to disable CS_MEMORY_MIGRATE.

In our scenario, we do need to perform memory migration when cpuset.mems
changes, while ensuring that other tasks are not blocked on cgroup_mutex
for an extended period of time.

One feasible approach is to revert the commit "cgroup/cpuset: Enable memory
migration for cpuset v2"[1]. This way, modifying cpuset.mems will not
trigger memory migration, and we can manually perform memory migration
using migrate_pages()/move_pages() syscalls.
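
As a rough illustration of the manual route, the sketch below moves a range
of pages to one node with move_pages(2), invoked through syscall(2) so that
libnuma is not required. The helper names build_page_list and
move_range_to_node are hypothetical. Passing pid 0 targets the calling
process; moving pages of another task needs its pid and, as far as we
understand the man page, suitable privileges.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)   /* value from <linux/mempolicy.h> */
#endif

/* Fill the per-page address and target-node vectors for move_pages(2). */
void build_page_list(void *base, size_t count, size_t page_size,
                     int target_node, void **pages, int *nodes)
{
    for (size_t i = 0; i < count; i++) {
        pages[i] = (char *)base + i * page_size;
        nodes[i] = target_node;
    }
}

/*
 * Ask the kernel to move `count` pages starting at `base` in process `pid`
 * (0 = caller) to `target_node`. status[i] receives the resulting node or a
 * negative errno for each page.
 * Returns 0 on success, a positive count of pages that were not migrated
 * (newer kernels), or -errno on failure.
 */
long move_range_to_node(pid_t pid, void *base, size_t count,
                        int target_node, int *status)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    void **pages = calloc(count, sizeof(*pages));
    int *nodes = calloc(count, sizeof(*nodes));
    long ret;

    if (!pages || !nodes) {
        ret = -ENOMEM;
        goto out;
    }
    build_page_list(base, count, page_size, target_node, pages, nodes);

    ret = syscall(SYS_move_pages, pid, (unsigned long)count, pages,
                  nodes, status, MPOL_MF_MOVE);
    if (ret < 0)
        ret = -errno;
out:
    free(pages);
    free(nodes);
    return ret;
}
```

Because the caller chooses how many pages to pass per call, a userspace
agent could split a large workload into smaller batches. Whether that is
acceptable for a given deployment is a separate question from the
cgroup_mutex issue described here.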

Another solution is to use a lazy approach for memory migration[2]. With
this approach we only walk through all the pages and set them to protnone,
and the NUMA faults triggered by later accesses handle the movement. That
would significantly reduce the time spent in cpuset_migrate_mm_workfn.
However, MPOL_MF_LAZY was disabled by commit 2cafb582173f ("mempolicy:
remove confusing MPOL_MF_LAZY dead code").

Do you have any better suggestions?

Thanks.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ee9707e8593dfb9a375cf4793c3fd03d4142b463
[2] https://lore.kernel.org/lkml/20210426065946.40491-1-wuyun.abel@bytedance.com/T/



Thread overview: 8+ messages
2024-03-25 14:46 [problem] Hung task caused by memory migration when cpuset.mems changes Chuyi Zhou
2024-03-26 17:26 ` Tejun Heo
2024-03-27 14:07   ` Chuyi Zhou
2024-03-27 16:13     ` Tejun Heo
2024-03-27 17:14   ` Waiman Long
2024-03-27 21:43     ` Tejun Heo
2024-03-28  7:53   ` Abel Wu
2024-03-28 17:19     ` Tejun Heo
