linux-mm.kvack.org archive mirror
From: "Michal Koutný" <mkoutny@suse.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@suse.com>,
	Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection
Date: Wed, 26 Feb 2020 14:22:37 +0100
Message-ID: <20200226132237.GA16746@blackbody.suse.cz>
In-Reply-To: <20200225150304.GA10257@cmpxchg.org>

Hello
and thanks for continuing the debate.

On Tue, Feb 25, 2020 at 10:03:04AM -0500, Johannes Weiner <hannes@cmpxchg.org> wrote:
> By origin protection I mean protection [...]
> If global reclaim occurs, [...]
> However, if limit reclaim in A occurs due to the 12G max limit [...]
My previous thinking was too bound to the absolute/global POV. Hence, the
effectiveness of memory.low is relative to the reclaim origin:
- full value -- protection against siblings (i.e. limit in the parent),
- reduced value (share) -- protection against (great-)uncles (i.e. limit in
  the (great-)grandparent).

(And depending on the absolute depth, it means respective protection
against global reclaim too.)

I see this hasn't changed since the original implementation without
effective protections. So it was just me being confused by the fact that
protection can be overcommitted locally (but not globally/at a higher
level, so it's consistent).

> That hinges on whether an opt-out mechanism makes sense, and we
> disagree on that part.
After my correction above, the calculation I had proposed would reduce
protection unnecessarily for reclaims triggered by nearby limits.

> > Simplest approach would be likely to introduce the special "inherit"
> > value (such a literal name may be misleading as it would be also
> > "dont-care").
> 
> Again, a complication of the interface for *everybody* 
Not if the special value is the new default (alas, that would still need
the mount option).


> Can you explain why you think protection is different from a weight?
- weights are dimensionless, they represent no real resource
- the sum of sibling weights is meaningless (and independent of the
  parent's weight)
- to me this protection is closer to limits (actually I like your simile
  that they're very lazily enforced limits)

> Both specify a minimum amount of a resource that the cgroup can use
> under contention, while allowing the cgroup to use more than that
> share if there is no contention with siblings.
>
> You configure memory in bytes instead of a relative proportion, but
> that's only because bytes are a natural unit of memory whereas a
> relative proportion of time is a natural unit of CPU and IO.
Weights specify a ratio (between siblings), not an amount. A single weight
is meaningless (the meaningful proportion would be the fraction of
cpu.max, i.e. relative to the absolute resource).

With weights, non-competing siblings drop out of the denominator;
with protection, non-competing siblings (in the sense of not consuming
their allowance) may add resource back to the pool (given by the parent).
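
To make the contrast above concrete, here is a minimal sketch. It is a toy
model, not kernel code: the function names, sibling names and numbers are
all made up for illustration.

```python
def weight_shares(weights, active):
    """Share of each competing sibling under weights.

    Siblings that do not compete are simply left out of the denominator,
    so the remaining ones split the whole resource among themselves.
    """
    total = sum(weights[name] for name in active)
    return {name: weights[name] / total for name in active}


def unused_protection(low, usage):
    """Protection a sibling does not consume (toy model, arbitrary units).

    A sibling using less than its allowance leaves the difference
    available to the pool given by the parent.
    """
    return {name: max(low[name] - usage[name], 0) for name in low}


# Hypothetical siblings x and y.
weights = {"x": 100, "y": 300}
shares_both = weight_shares(weights, ["x", "y"])  # x and y both compete
shares_x_only = weight_shares(weights, ["x"])     # y idle: x gets everything

low = {"x": 4, "y": 6}
usage = {"x": 5, "y": 2}
spare = unused_protection(low, usage)             # y leaves 4 units unused
```

A single weight carries no information here (x alone always ends up with
the full share), whereas a protection value keeps its meaning as an amount
even without competitors.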

> For example, if you assign a share of CPU or IO to a subtree, that
> applies to the entire subtree. Nobody has proposed being able to
> opt-out of shares in a subtree, let alone forcing individual cgroups
> to *opt-in* to receive these shares.
The former is because it makes no sense to deny all CPU/IO; the latter
is a consequence of that too.

> Now you apply memory pressure, what happens?. D isn't reclaimed, C is
> somewhat reclaimed, E is reclaimed hard. D will not page, C will page
> a little bit, E will page hard *with the higher IO priority of B*.
> 
> Now C is stuck behind E. This is a priority inversion.
This is how I understand the weights to work.

    A				
    `- B		io.weight=200
       `- D		io.weight=100 (e.g.)
       `- E		io.weight=100 (e.g.)
    `- C		io.weight=50

Whatever weights I assign to D and E, when only E and C compete, E will
have the higher weight (200 to 50, due to work conservation of weights).
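
The 200-to-50 outcome can be reproduced with a small sketch of hierarchical
weight distribution. This is a simplified model under my own assumptions
(work-conserving, a share is split only among children that currently
compete); it is not the actual IO controller logic, and the helper names
are hypothetical.

```python
# Tree and weights taken from the example above.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": [], "E": []}
weight = {"B": 200, "C": 50, "D": 100, "E": 100}


def is_active(node, competing):
    """A node is active if it or any of its descendants competes."""
    return node in competing or any(is_active(c, competing) for c in tree[node])


def effective_shares(node, competing, share=1.0, out=None):
    """Split `share` among the active children of `node`, recursively."""
    if out is None:
        out = {}
    children = [c for c in tree[node] if is_active(c, competing)]
    if not children:
        out[node] = share
        return out
    total = sum(weight[c] for c in children)
    for child in children:
        effective_shares(child, competing, share * weight[child] / total, out)
    return out


# Only E and C compete: E inherits all of B's share, i.e. 200 vs. 50.
shares = effective_shares("A", {"E", "C"})
```

With only E and C competing, E ends up with 0.8 and C with 0.2 of the
resource, regardless of the weights set on D and E.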

I don't think this inversion is wrong because E's work is still on
behalf of B.

Or did you mean that if protections were transformed (via effective
calculation) to have ratios only in the same range as io.weights
(1e-4..1e4 instead of 0..inf), then it'd prevent the inversion? (By
setting D,E weights in same ratios as D,E protections.)

> 1. Can you please make a practical use case for having scape goats or
>    donor groups to justify retaining what I consider to be an
>    unimportant artifact in the memory.low semantics?
    A.low=10G
    `- B.low=X   u=6G
    `- C.low=X   u=4G
    `- D.low=0G  u=5G

B,C   run the workload which should be protected
D     runs a job that doesn't need any protection
u     denotes usage
(I made the example with more than one important sibling to illustrate
usefulness of some implicit distribution X.)

When outer reclaim comes, reclaiming from B,C would be detrimental to
their performance, while the impact on D is unimportant. (The same goes
for the induced IO load on the rest of the system outside A.)

It's not possible to move D up to A's level, since A is all that a
given user can control.
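
One possible reading of the implicit distribution X in the example above,
as a sketch only: the usage-proportional split and the explicit opt-out
set are my assumptions for illustration, not the behaviour of the patch
or of any existing kernel.

```python
def distribute_protection(parent_low, usage, opted_out):
    """Split the parent's protection among children that did not opt out.

    Assumed policy: proportional to usage, capped at the child's usage.
    All values are in the same unit (G in the example).
    """
    eligible = {name: used for name, used in usage.items()
                if name not in opted_out}
    total = sum(eligible.values())
    result = {}
    for name, used in usage.items():
        if name in opted_out or total == 0:
            result[name] = 0.0
        else:
            result[name] = min(used, parent_low * used / total)
    return result


# Numbers from the example: A.low=10G, usage of B, C, D in G.
usage = {"B": 6, "C": 4, "D": 5}
protection = distribute_protection(10, usage, opted_out={"D"})

# What outer reclaim could take from each child under this model.
reclaimable = {name: usage[name] - protection[name] for name in usage}
```

Under these assumptions B and C are fully covered (6G and 4G) and the
whole 5G of D is what outer reclaim would take first.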

> 2. If you think opting out of hierarchically assigned resources is a
>    fundamentally important usecase, can you please either make an
>    argument why it should also apply to CPU and IO, or alternatively
>    explain in detail why they are meaningfully different?
I'd say that protected memory is a disposable resource in contrast with
CPU/IO. If you don't have the latter, you don't progress; if you lack the
former, you are refaulting but can still make progress. Even more, you
should be able to give up memory.min.

Michal

