From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20200417010617.927266-1-kuba@kernel.org> <CALvZod78ZUhU+yr2x1h_gv+VgVGTPnSSGKh_+fd+MeiAKreJvg@mail.gmail.com>
 <20200417162355.GA43469@mtj.thefacebook.com> <CALvZod4ftvXCu8SbQUXwTGVvx5K2+at9h30r28chZLXEB1JdfQ@mail.gmail.com>
 <20200417173615.GB43469@mtj.thefacebook.com>
In-Reply-To: <20200417173615.GB43469@mtj.thefacebook.com>
From: Shakeel Butt <shakeelb@google.com>
Date: Fri, 17 Apr 2020 10:51:10 -0700
Message-ID: <CALvZod7-r0OrJ+-_uCy_p3BU3348ve2+YatiSdLvFaVqcqCs=w@mail.gmail.com>
Subject: Re: [PATCH 0/3] memcg: Slow down swap allocation as the available
 space gets depleted
To: Tejun Heo <tj@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>, Andrew Morton <akpm@linux-foundation.org>, 
	Linux MM <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>, 
	Johannes Weiner <hannes@cmpxchg.org>, Chris Down <chris@chrisdown.name>, 
	Cgroups <cgroups@vger.kernel.org>
Content-Type: text/plain; charset="UTF-8"
List-ID: <linux-mm.kvack.org>

On Fri, Apr 17, 2020 at 10:36 AM Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Fri, Apr 17, 2020 at 10:18:15AM -0700, Shakeel Butt wrote:
> > > There currently are issues with anonymous memory management which make it
> > > different / worse than page cache but I don't follow why swapping
> > > necessarily means that isolation is broken. Page refaults don't indicate
> > > that memory isolation is broken after all.
> >
> > Sorry, I meant the performance isolation. Direct reclaim does not
> > really differentiate who to stall and whose CPU to use.
>
> Can you please elaborate on concrete scenarios? I'm having a hard time
> seeing differences from page cache.
>

Oh, I was talking about global reclaim here. In global reclaim, any
task can be throttled (throttle_direct_reclaim()). Memory freed using
the CPU time of high-priority, low-latency jobs can then be stolen by
low-priority batch jobs.

> > > > memcg limit reclaim and memcg limits are overcommitted. Shouldn't
> > > > running out of swap trigger the OOM earlier, which should be
> > > > better than impacting the whole system?
> > >
> > > The primary scenario which was being considered was undercommitted
> > > protections but I don't think that makes any relevant differences.
> > >
> >
> > What are undercommitted protections? Does it mean there is still swap
> > available on the system but the memcg is hitting its swap limit?
>
> Hahaha, I assumed you were talking about memory.high/max and was saying that
> the primary scenario that was being considered was usage of memory.low
> interacting with swap. Again, can you please give a concrete example so
> that we don't misunderstand each other?
>
> > > This is exactly similar to delay injection for memory.high. What's desired
> > > is slowing down the workload as the available resource is depleted so that
> > > the resource shortage presents as gradual degradation of performance and
> > > matching increase in resource PSI. This allows the situation to be detected
> > > and handled from userland while avoiding sudden and unpredictable behavior
> > > changes.
> > >
> >
> > Let me try to understand this with an example. Memcg 'A' has
>
> Ah, you already went there. Great.
>
> > memory.high = 100 MiB, memory.max = 150 MiB and memory.swap.max = 50
> > MiB. When A's usage goes over 100 MiB, it will reclaim the anon, file
> > and kmem. The anon will go to swap and increase its swap usage until
> > it hits the limit. Now the 'A' reclaim_high has fewer things (file &
> > kmem) to reclaim but the mem_cgroup_handle_over_high() will keep A's
> > increase in usage in check.
> >
> > So, my question is: should the slowdown by memory.high depend on the
> > reclaimable memory? If there is no reclaimable memory and the job hits
> > memory.high, should the kernel slow it down to a crawl until the PSI
> > monitor comes and decides what to do? If I understand correctly, the
> > problem is that the kernel slowdown is not successful when reclaimable
> > memory is very low. Please correct me if I am wrong.
>
> In combination with memory.high, swap slowdown may not be necessary because
> memory.high's slow down mechanism is already there to handle "can't swap"
> scenario whether that's because swap is disabled wholesale, limited or
> depleted. However, please consider the following scenario.
>
> cgroup A has memory.low protection and no other restrictions. cgroup B has
> no protection and has access to swap. When B's memory starts bloating and
> gets the system under memory contention, it'll start consuming swap until it
> can't. When swap becomes depleted for B, there's nothing holding it back and
> B will start eating into A's protection.
>

In this example, does 'B' have memory.high and memory.max set? And by
'A' having no other restrictions, I am assuming you meant unlimited
memory.high and memory.max for 'A'? Can 'A' use memory.min instead of
memory.low?
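
To make sure we are talking about the same setup, here is a rough
sketch of how I read the scenario. The sizes and the helper name
write_scenario() are made up for illustration (the thread gives no
numbers), and the unlimited memory.high/memory.max for 'A' is my
assumption, not something you stated. By default it writes into a
scratch directory, not /sys/fs/cgroup.

```python
import os
import tempfile

MiB = 1024 * 1024

# Hypothetical values: 'A' is protected by memory.low only, 'B' has no
# protection and unrestricted access to swap.
SCENARIO = {
    "A": {"memory.low": str(512 * MiB), "memory.high": "max", "memory.max": "max"},
    "B": {"memory.low": "0", "memory.swap.max": "max"},
}


def write_scenario(root, scenario=SCENARIO):
    """Write each cgroup's settings to <root>/<cgroup>/<interface file>."""
    written = {}
    for cgroup, knobs in scenario.items():
        cgdir = os.path.join(root, cgroup)
        os.makedirs(cgdir, exist_ok=True)
        for knob, value in knobs.items():
            path = os.path.join(cgdir, knob)
            with open(path, "w") as f:
                f.write(value + "\n")
            written[path] = value
    return written


if __name__ == "__main__":
    # Dry run against a temporary directory so no root access is needed.
    with tempfile.TemporaryDirectory() as root:
        for path, value in sorted(write_scenario(root).items()):
            print(os.path.relpath(path, root), "=", value)
```

Pointing root at a real cgroup v2 mount would need root privileges and
the memory controller enabled in the parent's cgroup.subtree_control.
The memory.min variant I am asking about would replace the memory.low
entry for 'A' with memory.min.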

> The proposed mechanism just plugs another vector for the same condition
> where anonymous memory management breaks down because it can no longer be
> reclaimed due to swap unavailability.
>
> Thanks.
>
> --
> tejun