From: Shaju Abraham
Date: Mon, 9 Mar 2020 21:02:50 +0530
Subject: Re: [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
To: Michal Hocko
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Shaju Abraham
References: <20200309113141.167289-1-shaju.abraham@nutanix.com> <20200309115818.GK8447@dhcp22.suse.cz>
In-Reply-To: <20200309115818.GK8447@dhcp22.suse.cz>

On Mon, Mar 9, 2020 at 5:28 PM Michal Hocko <mhocko@kernel.org> wrote:

> On Mon 09-03-20 11:31:41, Shaju Abraham wrote:
> > The VM pressure notification flags have excluded GFP_KERNEL with the
> > reasoning that user land will not be able to take any action in case of
> > kernel memory being low. This is not true always. Consider the case of
> > a user land program managing all the huge memory pages. By including
> > GFP_KERNEL flag whenever the kernel memory is low, pressure notification
> > can be send, and the manager process can split huge pages to satisfy kernel
> > memory requirement.
>
> Are you sure about this reasoning? GFP_KERNEL = __GFP_FS | __GFP_IO | __GFP_RECLAIM
> Two of the flags mentioned there are already listed so we are talking
> about __GFP_RECLAIM here. Including it here would be a more appropriate
> change than GFP_KERNEL btw.
>
> But still I do not really understand what is the actual problem and how
> is this patch meant to fix it. vmpressure is triggered only from the
> reclaim path which inherently requires to have __GFP_RECLAIM present
> so I fail to see how this can make any change at all. How have you
> tested it?
>

We have a user space application which waits on memory pressure events.
Upon receiving the event, the user space program will free up huge pages
to make more memory available in the system.

This mechanism works fine if the memory is being consumed by other user
space applications.
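
For illustration, here is a minimal sketch of how such a listener could wait for pressure events. It assumes the cgroup v1 memory controller interface (memory.pressure_level registered through cgroup.event_control with an eventfd); the thread does not say which interface the application actually uses. The struct, the function names, and the cgroup directory passed in are all hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper type: descriptors kept open while registered. */
struct pressure_listener {
    int efd; /* eventfd signalled on each pressure event */
    int pfd; /* <cgroup_dir>/memory.pressure_level */
    int cfd; /* <cgroup_dir>/cgroup.event_control */
};

/* Build the registration line "<eventfd> <pressure_level fd> <level>".
 * Returns 0 on success, -1 if the line does not fit in buf. */
int format_registration(char *buf, size_t len, int efd, int pfd,
                        const char *level)
{
    assert(buf != NULL && level != NULL);
    int n = snprintf(buf, len, "%d %d %s", efd, pfd, level);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Close whatever was opened and mark the listener as unused. */
void pressure_listener_close(struct pressure_listener *l)
{
    if (l->efd >= 0) close(l->efd);
    if (l->pfd >= 0) close(l->pfd);
    if (l->cfd >= 0) close(l->cfd);
    l->efd = l->pfd = l->cfd = -1;
}

/* Register for events at the given level ("low", "medium" or "critical").
 * cgroup_dir is an assumption, e.g. a cgroup v1 memory controller mount.
 * Returns 0 on success, -1 on any failure. */
int pressure_listener_open(struct pressure_listener *l,
                           const char *cgroup_dir, const char *level)
{
    char path[512], line[64];

    snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
    l->pfd = open(path, O_RDONLY);
    snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
    l->cfd = open(path, O_WRONLY);
    l->efd = eventfd(0, 0);

    if (l->pfd < 0 || l->cfd < 0 || l->efd < 0 ||
        format_registration(line, sizeof(line), l->efd, l->pfd, level) < 0 ||
        write(l->cfd, line, strlen(line) + 1) < 0) {
        pressure_listener_close(l);
        return -1;
    }
    return 0;
}

/* Block until the next pressure event. Returns 0 on success, -1 on error. */
int pressure_listener_wait(struct pressure_listener *l)
{
    uint64_t count;
    return read(l->efd, &count, sizeof(count)) == (ssize_t)sizeof(count)
               ? 0 : -1;
}
```

A manager process along these lines would call pressure_listener_open() once, then loop on pressure_listener_wait() and release huge pages each time it returns 0.
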
To test this, we wrote a test program which allocates all the memory
available in the system using malloc() and touches the allocated pages.
When the free memory level becomes low, the pressure event is fired and
the process gets notified about it.

The same test was then repeated with kmalloc() instead of malloc(). We
developed a test kernel module which allocates all the available memory
using kmalloc() with the GFP_KERNEL flag. The OOM killer gets invoked in
this case, and the memory pressure event is not fired.

After modifying vmpressure.c with the attached patch, the pressure event
gets triggered. Swap is disabled on the system we were testing.

Regards
Shaju

> > This is a common scanario in cloud. Most of the host memory is reserved
> > as hugepages and can be broken down to small pages on demand. This is
> > done to minimise fragmentation so that Virtual Machine power on will be
> > successful always.
> >
> > Signed-off-by: Shaju Abraham <shaju.abraham@nutanix.com>
> > ---
> >  mm/vmpressure.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index 4bac22fe1aa2..7ccfb3dd8173 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -253,7 +253,8 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> >        * Indirect reclaim (kswapd) sets sc->gfp_mask to GFP_KERNEL, so
> >        * we account it too.
> >        */
> > -     if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
> > +     if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO |
> > +                         __GFP_FS | GFP_KERNEL)))
> >               return;
> >
> >       /*
> > --
> > 2.20.1
> >
>
> --
> Michal Hocko
> SUSE Labs
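
The flag arithmetic discussed in this thread can be checked with a small user space model. This is only a sketch: the bit values below are stand-ins chosen for illustration, not the kernel's real values, and the DEMO_ names mirror the kernel's __GFP_* flags. It relies on the composition GFP_KERNEL = __GFP_RECLAIM | __GFP_IO | __GFP_FS quoted above, plus the assumption that GFP_NOIO carries only the reclaim bits.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Stand-in bit values, for illustration only. */
#define DEMO_GFP_HIGHMEM        0x01u
#define DEMO_GFP_MOVABLE        0x02u
#define DEMO_GFP_IO             0x04u
#define DEMO_GFP_FS             0x08u
#define DEMO_GFP_DIRECT_RECLAIM 0x10u
#define DEMO_GFP_KSWAPD_RECLAIM 0x20u

/* Composite flags, following the composition quoted in the thread. */
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)
#define DEMO_GFP_KERNEL  (DEMO_GFP_RECLAIM | DEMO_GFP_IO | DEMO_GFP_FS)
#define DEMO_GFP_NOIO    (DEMO_GFP_RECLAIM) /* assumption: reclaim bits only */

/* Mask tested by vmpressure() before and after the proposed patch. */
#define OLD_MASK (DEMO_GFP_HIGHMEM | DEMO_GFP_MOVABLE | \
                  DEMO_GFP_IO | DEMO_GFP_FS)
#define NEW_MASK (OLD_MASK | DEMO_GFP_KERNEL)

/* True if vmpressure() would account this allocation (no early return). */
bool accounted_old(gfp_t gfp) { return (gfp & OLD_MASK) != 0; }
bool accounted_new(gfp_t gfp) { return (gfp & NEW_MASK) != 0; }

/* Bits the patch adds to the mask. */
gfp_t added_bits(void) { return NEW_MASK & ~OLD_MASK; }
```

Under these assumptions, a DEMO_GFP_KERNEL request passes the old check already, because it carries DEMO_GFP_IO and DEMO_GFP_FS. The only bits the patch adds to the mask are the reclaim bits, so the check changes only for requests such as DEMO_GFP_NOIO that carry reclaim bits without IO or FS.
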

