From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <cover.1622043596.git.yuleixzhang@tencent.com> <CALvZod4SoCS6ym8ELTxWd6UwzUp8m_UUdw7oApAhW2WRq0BXqw@mail.gmail.com>
 <CACZOiM3VhYyzCTx4FbW=FF8WB=X46xaV53abqOVL+eHQOs8Reg@mail.gmail.com>
 <YLZIBpJFkKNBCg2X@chrisdown.name> <CACZOiM21STLrZgcnEwm8w2t82Qj3Ohy-BGbD5u62gTn=z4X3Lw@mail.gmail.com>
 <CALvZod7w1tzxvYCP54KHEo=k=qUd02UTkr+1+b5rTdn-tJt45w@mail.gmail.com>
In-Reply-To: <CALvZod7w1tzxvYCP54KHEo=k=qUd02UTkr+1+b5rTdn-tJt45w@mail.gmail.com>
From: yulei zhang <yulei.kernel@gmail.com>
Date: Thu, 3 Jun 2021 18:19:18 +0800
Message-ID: <CACZOiM3g6GhJgXurMPeE3A7zO8eUhoUPyUvyT3p2Kw98WkX8+g@mail.gmail.com>
Subject: Re: [RFC 0/7] Introduce memory allocation speed throttle in memcg
To: Shakeel Butt <shakeelb@google.com>
Cc: Chris Down <chris@chrisdown.name>, Tejun Heo <tj@kernel.org>, 
	Zefan Li <lizefan.x@bytedance.com>, Johannes Weiner <hannes@cmpxchg.org>, 
	Christian Brauner <christian@brauner.io>, Cgroups <cgroups@vger.kernel.org>, benbjiang@tencent.com, 
	Wanpeng Li <kernellwp@gmail.com>, Yulei Zhang <yuleixzhang@tencent.com>, 
	Linux MM <linux-mm@kvack.org>, Michal Hocko <mhocko@kernel.org>, Roman Gushchin <guro@fb.com>
Content-Type: text/plain; charset="UTF-8"
List-ID: <linux-mm.kvack.org>

On Wed, Jun 2, 2021 at 11:39 PM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Wed, Jun 2, 2021 at 2:11 AM yulei zhang <yulei.kernel@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 10:45 PM Chris Down <chris@chrisdown.name> wrote:
> > >
> > > yulei zhang writes:
> > > >Yep, dynamically adjust the memory.high limits can ease the memory pressure
> > > >and postpone the global reclaim, but it can easily trigger the oom in
> > > >the cgroups,
> > >
> > > To go further on Shakeel's point, which I agree with, memory.high should
> > > _never_ result in memcg OOM. Even if the limit is breached dramatically, we
> > > don't OOM the cgroup. If you have a demonstration of memory.high resulting in
> > > cgroup-level OOM kills in recent kernels, then that needs to be provided. :-)
> >
> > You are right, I mistook it for max. Shakeel means the throttling
> > during context switch which uses memory.high as threshold to
> > calculate the sleep time. Currently it only applies to cgroupv2.
> > In this patchset we explore another idea to throttle the memory
> > usage, which rely on setting an average allocation speed in memcg.
> > We hope to suppress the memory usage in low priority cgroups when
> > it reaches the system watermark and still keep the activities alive.
>
> I think you need to make the case: why should we add one more form of
> throttling? Basically why memory.high is not good for your use-case
> and the proposed solution works better. Though IMO it would be a hard
> sell.

Thanks. IMHO, the two throttles differ in purpose. memory.high is a
per-memcg throttle that aims to limit the memory usage of the tasks in
the cgroup. The memory allocation speed throttle (MST), by contrast,
aims to avoid memory bursts in one cgroup that would trigger global
reclaim and affect timing-sensitive workloads in other cgroups.

For example, suppose we have two pods with memory overcommit enabled,
one running online tasks and the other running offline tasks. If we
restrict the memory usage of the offline pod with memory.high, it loses
the benefit of memory overcommit when the other workloads are idle. On
the other hand, if we don't limit its memory usage, a sudden burst of
memory operations can easily break the system watermark. With MST
enabled in this case, we would be able to avoid direct reclaim and
still leverage the overcommit.
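
To make the idea concrete, here is a minimal user-space sketch of an
average-allocation-speed throttle. It is not the code from this
patchset. The class name SpeedThrottle, the charge() method, the
below_watermark flag and the 100 MiB/s limit are all hypothetical and
chosen only for illustration. The model assumes that a cgroup is slowed
down only while the system is below its watermark, and that the delay
is whatever brings the average speed since the window start back down
to the limit.

```python
from dataclasses import dataclass

MiB = 1024 * 1024


@dataclass
class SpeedThrottle:
    """Hypothetical model of a per-cgroup allocation speed throttle."""

    limit_bytes_per_s: float     # assumed average allocation speed limit
    window_start_s: float = 0.0  # start of the accounting window
    charged_bytes: int = 0       # bytes charged since window_start_s

    def charge(self, nbytes: int, now_s: float, below_watermark: bool) -> float:
        """Account nbytes at time now_s and return the sleep time in seconds."""
        self.charged_bytes += nbytes
        if not below_watermark:
            # Assumption: with enough free memory the cgroup runs
            # unthrottled, so it can still benefit from overcommit.
            return 0.0
        elapsed = now_s - self.window_start_s
        # Time the charged bytes would have taken at exactly the limit speed.
        earliest = self.charged_bytes / self.limit_bytes_per_s
        # Sleep only for the part of that time that has not passed yet.
        return max(0.0, earliest - elapsed)


if __name__ == "__main__":
    offline = SpeedThrottle(limit_bytes_per_s=100 * MiB)
    # A burst while memory is plentiful is accounted but not delayed.
    print(offline.charge(50 * MiB, now_s=0.1, below_watermark=False))
    # The same history under memory pressure yields a positive delay.
    print(offline.charge(0, now_s=0.1, below_watermark=True))
```

The sketch never resets its accounting window. A real implementation
would need some periodic reset or decay so that old allocations do not
penalize a cgroup forever.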

