linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] mm: memcontrol: recursive memory protection
@ 2019-12-13 19:21 Johannes Weiner
  2019-12-13 19:21 ` [PATCH 1/3] mm: memcontrol: fix memory.low proportional distribution Johannes Weiner
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Johannes Weiner @ 2019-12-13 19:21 UTC
  To: Andrew Morton
  Cc: Michal Hocko, Roman Gushchin, Tejun Heo, linux-mm, cgroups,
	linux-kernel, kernel-team

The current memory.low (and memory.min) semantics require protection
to be assigned to a cgroup in an uninterrupted chain from the
top-level cgroup all the way to the leaf.

In practice, we want to protect entire cgroup subtrees from each other
(system management software vs. workload), but we would like the VM to
balance memory optimally *within* each subtree, without having to make
explicit weight allocations among individual components. The current
semantics make that impossible.

This patch series extends memory.low/min such that the knobs apply
recursively to the entire subtree. Users can still assign explicit
protection to subgroups, but if they don't, the protection set by the
parent cgroup will be distributed dynamically such that children
compete freely - as if no memory control were enabled inside the
subtree - but enjoy protection from neighboring trees.

Patch #1 fixes an existing bug that can give a cgroup tree more
protection than it should receive as per ancestor configuration.

Patch #2 simplifies and documents the existing code to make it easier
to reason about the changes in the next patch.

Patch #3 finally implements recursive memory protection semantics.

Because of a risk of regressing legacy setups, the new semantics are
hidden behind a cgroup2 mount option, 'memory_recursiveprot'.

More details in patch #3.
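
For orientation, the proportional distribution that patch #1 touches
can be modelled roughly as below. This is a simplified sketch, not the
kernel code: the function name `scaled_protection`, its parameters and
the unit-less numbers are made up for illustration. It assumes that
when the children's combined claims exceed what the parent can pass
on, every claim is scaled down by the same factor, so the subtree
never ends up with more protection than its ancestors grant.

```python
def scaled_protection(usage, setting, parent_effective, siblings_protected):
    """Illustrative model of one child's effective memory.low.

    usage              -- memory currently used by the child
    setting            -- the child's configured memory.low
    parent_effective   -- protection the parent actually has to hand out
    siblings_protected -- sum of min(usage, setting) over all children
    """
    # A child can only claim protection for memory it actually uses.
    claimed = min(usage, setting)
    # If the children together claim more than the parent has, scale
    # every claim by the same factor so the total fits the parent.
    if siblings_protected > parent_effective:
        return claimed * parent_effective // siblings_protected
    return claimed


# Two children each claim 6 units, but the parent only has 8 to give.
children = [(10, 6), (10, 6)]  # (usage, memory.low setting)
parent_effective = 8
total_claimed = sum(min(u, s) for u, s in children)
shares = [scaled_protection(u, s, parent_effective, total_claimed)
          for u, s in children]
print(shares)  # [4, 4]
```

In this model the shares always sum to at most `parent_effective`,
which is the property the bug described above would violate.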

 Documentation/admin-guide/cgroup-v2.rst |  11 ++
 include/linux/cgroup-defs.h             |   5 +
 kernel/cgroup/cgroup.c                  |  17 ++-
 mm/memcontrol.c                         | 241 +++++++++++++++++++-----------
 mm/page_counter.c                       |  12 +-
 5 files changed, 190 insertions(+), 96 deletions(-)



* [PATCH 0/3] mm: memcontrol: recursive memory.low protection
@ 2020-02-27 19:56 Johannes Weiner
  2020-02-27 19:56 ` [PATCH 3/3] " Johannes Weiner
  0 siblings, 1 reply; 9+ messages in thread
From: Johannes Weiner @ 2020-02-27 19:56 UTC
  To: Andrew Morton
  Cc: Roman Gushchin, Michal Hocko, Tejun Heo, Chris Down,
	Michal Koutný, linux-mm, cgroups, linux-kernel, kernel-team

Changes since v2:
- Changelog & documentation updates (Michal Hocko, Michal Koutný)

Changes since v1:
- improved Changelogs based on the discussion with Roman. Thanks!
- fix div0 when recursive & fixed protection is combined
- fix an unused compiler warning

The current memory.low (and memory.min) semantics require protection
to be assigned to a cgroup in an uninterrupted chain from the
top-level cgroup all the way to the leaf.

In practice, we want to protect entire cgroup subtrees from each other
(system management software vs. workload), but we would like the VM to
balance memory optimally *within* each subtree, without having to make
explicit weight allocations among individual components. The current
semantics make that impossible.

They also introduce unmanageable complexity into more advanced
resource trees. For example:

          host root
          `- system.slice
             `- rpm upgrades
             `- logging
          `- workload.slice
             `- a container
                `- system.slice
                `- workload.slice
                   `- job A
                      `- component 1
                      `- component 2
                   `- job B

From a host-level perspective, we would like to protect the outer
workload.slice subtree as a whole from rpm upgrades, logging etc. But
for that to be effective, right now we'd have to propagate it down
through the container, the inner workload.slice, into the job cgroup
and ultimately the component cgroups where memory is actually,
physically allocated. This may cross several tree delegation points
and namespace boundaries, which makes such a setup nearly impossible.

CPU and IO on the other hand are already distributed recursively. The
user would simply configure allowances at the host level, and they
would apply to the entire subtree without any downward propagation.

To enable the above-mentioned use cases and bring memory in line with
other resource controllers, this patch series extends memory.low/min
such that settings apply recursively to the entire subtree. Users can
still assign explicit shares in subgroups, but if they don't, any
ancestral protection will be distributed such that children compete
freely amongst each other - as if no memory control were enabled
inside the subtree - but enjoy protection from neighboring trees.

In the above example, the user would then be able to configure shares
of CPU, IO and memory at the host level to comprehensively protect and
isolate the workload.slice as a whole from system.slice activity.
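
The difference between the current and the proposed semantics can be
sketched with a small model. This is an illustration under stated
assumptions, not the code from patch #3: the function name
`effective_protection`, the `recursive` flag (standing in for the
mount option) and the rule of splitting unclaimed protection in
proportion to each child's unprotected usage are all hypothetical
choices made for this example, and the numbers are arbitrary units.

```python
def effective_protection(usage, setting, parent_usage, parent_effective,
                         siblings_protected, recursive):
    """Illustrative model of one child's effective protection.

    usage              -- memory used by this child
    setting            -- this child's own memory.low (0 if unset)
    parent_usage       -- memory used by the parent subtree
    parent_effective   -- protection the parent has to hand out
    siblings_protected -- sum of min(usage, setting) over all children
    recursive          -- model the recursive semantics if True
    """
    claimed = min(usage, setting)
    # Overcommitted parent: scale explicit claims down proportionally.
    if siblings_protected > parent_effective:
        return claimed * parent_effective // siblings_protected
    if not recursive:
        # Current semantics: no explicit setting, no protection.
        return claimed
    # Recursive semantics (assumed rule): protection the children did
    # not claim explicitly is shared by unprotected usage.
    unclaimed = parent_effective - siblings_protected
    unprotected_total = parent_usage - siblings_protected
    # Guard against dividing by zero when all usage is already claimed.
    if unclaimed > 0 and usage > claimed and unprotected_total > 0:
        claimed += unclaimed * (usage - claimed) // unprotected_total
    return claimed


# workload.slice uses 80 units and has 40 units of protection.
# Job A uses 60, job B uses 20; neither sets memory.low itself.
jobs = {"job A": 60, "job B": 20}
for flag in (False, True):
    result = {name: effective_protection(use, 0, 80, 40, 0, flag)
              for name, use in jobs.items()}
    print("recursive" if flag else "current", result)
```

With the flag off, both jobs end up with zero protection because the
chain of explicit settings is broken at the job level. With it on,
the model hands the parent's 40 units to the jobs as 30 and 10, so
they compete freely with each other while the subtree as a whole
stays protected.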

Patch #1 fixes an existing bug that can give a cgroup tree more
protection than it should receive as per ancestor configuration.

Patch #2 simplifies and documents the existing code to make it easier
to reason about the changes in the next patch.

Patch #3 finally implements recursive memory protection semantics.

Because of a risk of regressing legacy setups, the new semantics are
hidden behind a cgroup2 mount option, 'memory_recursiveprot'.

More details in patch #3.

 Documentation/admin-guide/cgroup-v2.rst |  11 ++
 include/linux/cgroup-defs.h             |   5 +
 kernel/cgroup/cgroup.c                  |  17 ++-
 mm/memcontrol.c                         | 220 +++++++++++++++++-------------
 mm/page_counter.c                       |  12 +-
 5 files changed, 160 insertions(+), 105 deletions(-)



Thread overview: 9+ messages
2019-12-13 19:21 [PATCH 0/3] mm: memcontrol: recursive memory protection Johannes Weiner
2019-12-13 19:21 ` [PATCH 1/3] mm: memcontrol: fix memory.low proportional distribution Johannes Weiner
2019-12-13 20:40   ` Roman Gushchin
2019-12-16 18:25     ` Johannes Weiner
2019-12-16 19:11       ` Roman Gushchin
2019-12-13 19:21 ` [PATCH 2/3] mm: memcontrol: clean up and document effective low/min calculations Johannes Weiner
2019-12-13 19:21 ` [PATCH 3/3] mm: memcontrol: recursive memory.low protection Johannes Weiner
2019-12-13 20:05   ` Johannes Weiner
2020-02-27 19:56 [PATCH 0/3] " Johannes Weiner
2020-02-27 19:56 ` [PATCH 3/3] " Johannes Weiner