From: Johannes Weiner <email@example.com>
To: Michal Hocko <firstname.lastname@example.org>
Cc: Tejun Heo <email@example.com>, Shakeel Butt <firstname.lastname@example.org>,
 Jakub Kicinski <email@example.com>, Andrew Morton <firstname.lastname@example.org>,
 Linux MM <email@example.com>, Kernel Team <firstname.lastname@example.org>,
 Chris Down <email@example.com>, Cgroups <firstname.lastname@example.org>
Subject: Re: [PATCH 0/3] memcg: Slow down swap allocation as the available space gets depleted
Date: Tue, 21 Apr 2020 12:56:01 -0400
Message-ID: <20200421165601.GA345998@cmpxchg.org>
In-Reply-To: <20200421161138.GL27314@dhcp22.suse.cz>

On Tue, Apr 21, 2020 at 06:11:38PM +0200, Michal Hocko wrote:
> On Tue 21-04-20 10:27:46, Johannes Weiner wrote:
> > On Tue, Apr 21, 2020 at 01:06:12PM +0200, Michal Hocko wrote:
> [...]
> > > I am also not sure about the isolation aspect. Because an external
> > > memory pressure might have pushed out memory to the swap and then the
> > > workload is throttled based on an external event. Compare that to the
> > > memory.high throttling which is not directly affected by the external
> > > pressure.
> >
> > Neither memory.high nor swap.high isolate from external pressure.
>
> I didn't say they do. What I am saying is that an external pressure
> might punish swap.high memcg because the external memory pressure would
> eat up the quota and trigger the throttling.

External pressure could also push a cgroup into a swap device that
happens to be very slow and cause the cgroup to be throttled that way.

But that effect is actually not undesirable. External pressure means
that something more important runs and needs the memory of something
less important (otherwise, memory.low would deflect this intrusion).

So we're punishing/deprioritizing the right cgroup here. The one that
isn't protected from memory pressure.

> It is fair to say that this externally triggered interference is already
> possible with swap.max as well though. It would likely be just more
> verbose because of the oom killer intervention rather than a slowdown.

Right.

> > They
> > are put on cgroups so they don't cause pressure on other cgroups. Swap
> > is required when either your footprint grows or your available space
> > shrinks. That's why it behaves like that.
> >
> > That being said, I think we're getting lost in the implementation
> > details before we have established what the purpose of this all
> > is. Let's talk about this first.
>
> Thanks for describing it in the length. I have a better picture of the
> intention (this should have been in the changelog ideally). I can see
> how the swap consumption throttling might be useful but I still dislike the
> proposed implementation. Mostly because of throttling of all allocations
> regardless whether they can contribute to the swap consumption or not.

I mean, even if they're not swappable, they can still contribute to
swap consumption that wouldn't otherwise have been there. Each new
page that comes in displaces another page at the end of the big LRU
pipeline and pushes it into the mouth of reclaim - which may swap. So
*every* allocation has a certain probability of increasing swap usage.

The fact that we have reached swap.high is a good hint that reclaim
has indeed been swapping quite aggressively to accommodate incoming
allocations, and probably will continue to do so.

We could check whether there are NO anon pages left in a workload, but
that's such an extreme and short-lived case that it probably wouldn't
make a difference in practice.

We could try to come up with a model that calculates a probability of
each new allocation to cause swap. Whether that new allocation itself
is swapbacked would of course be a factor, but there are other factors
as well: the millions of existing LRU pages, the reclaim decisions we
will make, swappiness and so forth.

Of course, I agree with you, if all you have coming in is cache
allocations, you'd *eventually* run out of pages to swap.

However, 10G of new active cache allocations can still cause 10G of
already allocated anon pages to get swapped out. For example if a
malloc() leak happened *before* the regular cache workingset is
established. We cannot retro-actively throttle those anon pages, we
can only keep new allocations from pushing old ones into swap.
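
The displacement argument above can be illustrated with a toy
simulation. This is only a sketch under loud assumptions, not kernel
code: it models reclaim as a single FIFO list of fixed capacity, with
no active/inactive split, no separate anon and file LRUs, no
swappiness and no reference bits. The names simulate() and
lru_capacity are hypothetical and exist only for this illustration.

```python
from collections import deque


def simulate(anon_leak_pages, cache_pages, lru_capacity):
    """Toy model: one FIFO list standing in for the LRU pipeline.

    New pages enter at the head. Once the list is full, the oldest page
    (at the tail) is reclaimed to make room. Reclaiming an anon page is
    counted as a swap-out; reclaiming a cache page is counted as a drop.
    Returns (swapped_anon_pages, dropped_cache_pages).
    """
    lru = deque()
    swapped = 0
    dropped = 0

    def allocate(kind):
        nonlocal swapped, dropped
        if len(lru) >= lru_capacity:
            victim = lru.pop()  # oldest page, at the tail
            if victim == "anon":
                swapped += 1
            else:
                dropped += 1
        lru.appendleft(kind)  # newest page, at the head

    # The anon leak happens first, before the cache workingset exists.
    for _ in range(anon_leak_pages):
        allocate("anon")

    # Afterwards only cache allocations come in.
    for _ in range(cache_pages):
        allocate("cache")

    return swapped, dropped


if __name__ == "__main__":
    # 10 leaked anon pages fill the list, then 10 cache pages arrive.
    print(simulate(10, 10, 10))  # prints (10, 0)
```

In this model none of the incoming allocations are swapbacked, yet
each one pushes an old anon page out until the leaked pages are gone.
Only after that do further cache allocations stop adding to swap
usage, which is the "eventually" case.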