linux-mm.kvack.org archive mirror
* [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
@ 2020-03-09 11:31 Shaju Abraham
  2020-03-09 11:58 ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Shaju Abraham @ 2020-03-09 11:31 UTC
  Cc: akpm, linux-mm, linux-kernel, shajunutanix, Shaju Abraham

The VM pressure notification flags have excluded GFP_KERNEL with the
reasoning that user land will not be able to take any action in case of
kernel memory being low. This is not true always. Consider the case of
a user land program managing all the huge memory pages. By including
GFP_KERNEL flag whenever the kernel memory is low, pressure notification
can be sent, and the manager process can split huge pages to satisfy kernel
memory requirement.
This is a common scenario in cloud. Most of the host memory is reserved
as hugepages and can be broken down to small pages on demand. This is
done to minimise fragmentation so that Virtual Machine power on will be
successful always.

Signed-off-by: Shaju Abraham <shaju.abraham@nutanix.com>
---
 mm/vmpressure.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 4bac22fe1aa2..7ccfb3dd8173 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -253,7 +253,8 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
 	 * Indirect reclaim (kswapd) sets sc->gfp_mask to GFP_KERNEL, so
 	 * we account it too.
 	 */
-	if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
+	if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO |
+		     __GFP_FS | GFP_KERNEL)))
 		return;
 
 	/*
-- 
2.20.1




* Re: [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
  2020-03-09 11:31 [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure Shaju Abraham
@ 2020-03-09 11:58 ` Michal Hocko
  2020-03-09 15:32   ` Shaju Abraham
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2020-03-09 11:58 UTC
  To: Shaju Abraham; +Cc: akpm, linux-mm, linux-kernel, Shaju Abraham

On Mon 09-03-20 11:31:41, Shaju Abraham wrote:
> The VM pressure notification flags have excluded GFP_KERNEL with the
> reasoning that user land will not be able to take any action in case of
> kernel memory being low. This is not true always. Consider the case of
> a user land program managing all the huge memory pages. By including
> GFP_KERNEL flag whenever the kernel memory is low, pressure notification
> can be sent, and the manager process can split huge pages to satisfy kernel
> memory requirement.

Are you sure about this reasoning? GFP_KERNEL = __GFP_FS | __GFP_IO | __GFP_RECLAIM
Two of the flags mentioned there are already listed so we are talking
about __GFP_RECLAIM here. Including it here would be a more appropriate
change than GFP_KERNEL btw.

But I still do not really understand what the actual problem is and how
this patch is meant to fix it. vmpressure is triggered only from the
reclaim path, which inherently requires __GFP_RECLAIM to be present,
so I fail to see how this can make any difference at all. How have you
tested it?
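
The flag arithmetic above can be checked with a small user-space sketch.
The bit values below are illustrative placeholders, not the kernel's real
ones (those are defined in include/linux/gfp.h and may differ between
versions); only the composition of __GFP_RECLAIM and GFP_KERNEL follows
the definition quoted above. OLD_MASK, NEW_MASK and both helper functions
are hypothetical names introduced for this example.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only; the kernel's real values may differ. */
#define __GFP_HIGHMEM        0x02u
#define __GFP_MOVABLE        0x08u
#define __GFP_IO             0x40u
#define __GFP_FS             0x80u
#define __GFP_DIRECT_RECLAIM 0x400u
#define __GFP_KSWAPD_RECLAIM 0x800u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

/* Mask tested by vmpressure() before the patch. */
#define OLD_MASK (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)
/* Mask tested by vmpressure() with the patch applied. */
#define NEW_MASK (OLD_MASK | GFP_KERNEL)

/* Bits that the patch adds to the mask (hypothetical helper). */
gfp_t patch_added_bits(void)
{
	return NEW_MASK & ~OLD_MASK;
}

/* The patch is equivalent to OR-ing in __GFP_RECLAIM alone. */
void check_patch_effect(void)
{
	assert(patch_added_bits() == __GFP_RECLAIM);
	assert(NEW_MASK == (OLD_MASK | __GFP_RECLAIM));
}
```

Under these assumptions, the only bits the patch contributes are the two
reclaim bits, since __GFP_IO and __GFP_FS were already in the mask.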

> This is a common scenario in cloud. Most of the host memory is reserved
> as hugepages and can be broken down to small pages on demand. This is
> done to minimise fragmentation so that Virtual Machine power on will be
> successful always.
> 
> Signed-off-by: Shaju Abraham <shaju.abraham@nutanix.com>
> ---
>  mm/vmpressure.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> index 4bac22fe1aa2..7ccfb3dd8173 100644
> --- a/mm/vmpressure.c
> +++ b/mm/vmpressure.c
> @@ -253,7 +253,8 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
>  	 * Indirect reclaim (kswapd) sets sc->gfp_mask to GFP_KERNEL, so
>  	 * we account it too.
>  	 */
> -	if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
> +	if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO |
> +		     __GFP_FS | GFP_KERNEL)))
>  		return;
>  
>  	/*
> -- 
> 2.20.1
> 

-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
  2020-03-09 11:58 ` Michal Hocko
@ 2020-03-09 15:32   ` Shaju Abraham
  2020-03-09 16:12     ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Shaju Abraham @ 2020-03-09 15:32 UTC
  To: Michal Hocko; +Cc: akpm, linux-mm, linux-kernel, Shaju Abraham


On Mon, Mar 9, 2020 at 5:28 PM Michal Hocko <mhocko@kernel.org> wrote:

> On Mon 09-03-20 11:31:41, Shaju Abraham wrote:
> > The VM pressure notification flags have excluded GFP_KERNEL with the
> > reasoning that user land will not be able to take any action in case of
> > kernel memory being low. This is not true always. Consider the case of
> > a user land program managing all the huge memory pages. By including
> > GFP_KERNEL flag whenever the kernel memory is low, pressure notification
> > can be sent, and the manager process can split huge pages to satisfy kernel
> > memory requirement.
>
> Are you sure about this reasoning? GFP_KERNEL = __GFP_FS | __GFP_IO |
> __GFP_RECLAIM
> Two of the flags mentioned there are already listed so we are talking
> about __GFP_RECLAIM here. Including it here would be a more appropriate
> change than GFP_KERNEL btw.
>
> But still I do not really understand what is the actual problem and how
> is this patch meant to fix it. vmpressure is triggered only from the
> reclaim path which inherently requires to have __GFP_RECLAIM present
> so I fail to see how this can make any change at all. How have you
> tested it?
>
We have a user space application which waits on memory pressure events.
Upon receiving the event, the user space program frees up huge pages to
make more memory available in the system.

This mechanism works fine if the memory is being consumed by other user
space applications. To test this, we wrote a test program which allocates
all the memory available in the system using malloc() and touches the
allocated pages. When the free memory level becomes low, the pressure
event is fired and the process gets notified about it.

The same test was repeated with kmalloc() instead of malloc(). We wrote a
test kernel module which allocates all the available memory with
kmalloc() using the GFP_KERNEL flag. The OOM killer gets invoked in this
case, and the memory pressure event is not fired. After modifying
vmpressure.c with the attached patch, the pressure event gets triggered.
Swap is disabled on the system we were testing.
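
The listener itself is not shown in this thread. As a rough sketch of how
such a program might register for events, the code below assumes the
cgroup v1 memory controller interface (memory.pressure_level together
with cgroup.event_control and an eventfd), which is the interface backed
by mm/vmpressure.c. The function names, buffer sizes and any cgroup path
passed in are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Build the "<event_fd> <pressure_level_fd> <level>" line that is written
 * to cgroup.event_control. Returns the length, or -1 if it does not fit.
 */
int format_event_control(char *buf, size_t size, int efd, int pfd,
			 const char *level)
{
	int n = snprintf(buf, size, "%d %d %s", efd, pfd, level);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/*
 * Register for pressure events ("low", "medium" or "critical") on a
 * cgroup v1 memory cgroup directory. Returns an eventfd, or -1 on error.
 */
int register_pressure_listener(const char *cgroup_dir, const char *level)
{
	char path[512], line[64];
	int efd = -1, pfd = -1, cfd = -1, len;

	assert(cgroup_dir && level);

	len = snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		return -1;
	pfd = open(path, O_RDONLY);
	if (pfd < 0)
		return -1;

	len = snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
	if (len < 0 || (size_t)len >= sizeof(path))
		goto fail;
	cfd = open(path, O_WRONLY);
	efd = eventfd(0, 0);
	if (cfd < 0 || efd < 0)
		goto fail;

	len = format_event_control(line, sizeof(line), efd, pfd, level);
	if (len < 0 || write(cfd, line, (size_t)len) != (ssize_t)len)
		goto fail;

	close(cfd);
	/* pfd is kept open for the lifetime of the listener in this sketch. */
	return efd;

fail:
	if (efd >= 0)
		close(efd);
	if (cfd >= 0)
		close(cfd);
	close(pfd);
	return -1;
}

/* Block until the kernel signals a pressure event; 0 on success. */
int wait_for_pressure(int efd)
{
	uint64_t count;

	return read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count) ? 0 : -1;
}
```

A caller would register once, for example against the root memory cgroup
(often mounted at /sys/fs/cgroup/memory, though the mount point is an
assumption), then loop on wait_for_pressure() and release huge pages each
time it returns.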

 Regards
 Shaju


> > This is a common scenario in cloud. Most of the host memory is reserved
> > as hugepages and can be broken down to small pages on demand. This is
> > done to minimise fragmentation so that Virtual Machine power on will be
> > successful always.
> >
> > Signed-off-by: Shaju Abraham <shaju.abraham@nutanix.com>
> > ---
> >  mm/vmpressure.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/vmpressure.c b/mm/vmpressure.c
> > index 4bac22fe1aa2..7ccfb3dd8173 100644
> > --- a/mm/vmpressure.c
> > +++ b/mm/vmpressure.c
> > @@ -253,7 +253,8 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
> >        * Indirect reclaim (kswapd) sets sc->gfp_mask to GFP_KERNEL, so
> >        * we account it too.
> >        */
> > -     if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)))
> > +     if (!(gfp & (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO |
> > +                  __GFP_FS | GFP_KERNEL)))
> >               return;
> >
> >       /*
> > --
> > 2.20.1
> >
>
> --
> Michal Hocko
> SUSE Labs
>



* Re: [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
  2020-03-09 15:32   ` Shaju Abraham
@ 2020-03-09 16:12     ` Michal Hocko
  2020-03-10  7:39       ` Shaju Abraham
  0 siblings, 1 reply; 5+ messages in thread
From: Michal Hocko @ 2020-03-09 16:12 UTC
  To: Shaju Abraham; +Cc: akpm, linux-mm, linux-kernel, Shaju Abraham

On Mon 09-03-20 21:02:50, Shaju Abraham wrote:
> On Mon, Mar 9, 2020 at 5:28 PM Michal Hocko <mhocko@kernel.org> wrote:
> 
> > On Mon 09-03-20 11:31:41, Shaju Abraham wrote:
> > > The VM pressure notification flags have excluded GFP_KERNEL with the
> > > reasoning that user land will not be able to take any action in case of
> > > kernel memory being low. This is not true always. Consider the case of
> > > a user land program managing all the huge memory pages. By including
> > > GFP_KERNEL flag whenever the kernel memory is low, pressure notification
> > > can be sent, and the manager process can split huge pages to satisfy kernel
> > > memory requirement.
> >
> > Are you sure about this reasoning? GFP_KERNEL = __GFP_FS | __GFP_IO |
> > __GFP_RECLAIM
> > Two of the flags mentioned there are already listed so we are talking
> > about __GFP_RECLAIM here. Including it here would be a more appropriate
> > change than GFP_KERNEL btw.
> >
> > But still I do not really understand what is the actual problem and how
> > is this patch meant to fix it. vmpressure is triggered only from the
> > reclaim path which inherently requires to have __GFP_RECLAIM present
> > so I fail to see how this can make any change at all. How have you
> > tested it?
> >
> We have a user space application which waits on memory pressure events.
> Upon receiving the event, the user space program will free up huge
> pages to make more memory available in the system.  This mechanism
> works fine if the memory is being consumed by other user space
> applications. To test this, we wrote a test program which will
> allocate all the memory available in the system using malloc() and
> touch the allocated pages. When the free memory level becomes low,
> the pressure event is fired and the process gets notified about it .
> The same test is repeated with kmalloc() instead of malloc(). A test
> kernel module is developed, which will allocate all the available
> memory with kmalloc(GFP_KERNEL) flag.  The OOM killer gets invoked in
> this case. The memory pressure event is not fired.  After modifying
> the vmpressure.c with the attached patch, the pressure event gets
> triggered.  Swap is disabled in the system we were testing.

Are you sure this is really the case? I am either missing something here
or your test might simply be timing specific because

	GFP_KERNEL & (__GFP_FS | __GFP_IO) = true

so I really do not see how the current code could bail out on the test
you are patching so that the patch would make any change. The only real
difference this patch makes is to trigger events for __GFP_RECLAIM
allocations which could be GFP_NOIO. All non-sleepable allocations would
wake kswapd and that would in turn reclaim with __GFP_FS | __GFP_IO set
so the check doesn't change anything.

Am I missing something?
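
As a rough illustration of the argument above, this user-space sketch
evaluates the vmpressure() early-return test for a few allocation
contexts, before and after the patch. The bit values are illustrative
placeholders rather than the kernel's real ones, and OLD_MASK plus the two
accounted_*() helpers are hypothetical names; the compositions of
GFP_NOIO, GFP_NOFS and GFP_KERNEL follow the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int gfp_t;

/* Illustrative bit values only; the kernel's real values may differ. */
#define __GFP_HIGHMEM        0x02u
#define __GFP_MOVABLE        0x08u
#define __GFP_IO             0x40u
#define __GFP_FS             0x80u
#define __GFP_DIRECT_RECLAIM 0x400u
#define __GFP_KSWAPD_RECLAIM 0x800u

#define __GFP_RECLAIM (__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define GFP_NOIO      (__GFP_RECLAIM)
#define GFP_NOFS      (__GFP_RECLAIM | __GFP_IO)
#define GFP_KERNEL    (__GFP_RECLAIM | __GFP_IO | __GFP_FS)

#define OLD_MASK (__GFP_HIGHMEM | __GFP_MOVABLE | __GFP_IO | __GFP_FS)

/* True when the unpatched check would NOT return early. */
bool accounted_before_patch(gfp_t gfp)
{
	return (gfp & OLD_MASK) != 0;
}

/* True when the patched check would NOT return early. */
bool accounted_after_patch(gfp_t gfp)
{
	return (gfp & (OLD_MASK | GFP_KERNEL)) != 0;
}

/* Walk through the cases discussed in the mail. */
void check_contexts(void)
{
	/* GFP_KERNEL reclaim (including kswapd) is accounted either way. */
	assert(accounted_before_patch(GFP_KERNEL));
	assert(accounted_after_patch(GFP_KERNEL));

	/* GFP_NOIO is the case whose outcome the patch changes. */
	assert(!accounted_before_patch(GFP_NOIO));
	assert(accounted_after_patch(GFP_NOIO));
}
```

With these definitions a GFP_KERNEL reclaim passes the check both before
and after the patch, so the patch cannot explain a missing event for a
kmalloc(GFP_KERNEL) test.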
-- 
Michal Hocko
SUSE Labs



* Re: [PATCH] mm/vmpressure.c: Include GFP_KERNEL flag to vmpressure
  2020-03-09 16:12     ` Michal Hocko
@ 2020-03-10  7:39       ` Shaju Abraham
  0 siblings, 0 replies; 5+ messages in thread
From: Shaju Abraham @ 2020-03-10  7:39 UTC
  To: Michal Hocko; +Cc: akpm, linux-mm, linux-kernel, Shaju Abraham

On Mon, Mar 9, 2020 at 9:42 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Mon 09-03-20 21:02:50, Shaju Abraham wrote:
> > On Mon, Mar 9, 2020 at 5:28 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > > On Mon 09-03-20 11:31:41, Shaju Abraham wrote:
> > > > The VM pressure notification flags have excluded GFP_KERNEL with the
> > > > reasoning that user land will not be able to take any action in case of
> > > > kernel memory being low. This is not true always. Consider the case of
> > > > a user land program managing all the huge memory pages. By including
> > > > GFP_KERNEL flag whenever the kernel memory is low, pressure notification
> > > > can be sent, and the manager process can split huge pages to satisfy kernel
> > > > memory requirement.
> > >
> > > Are you sure about this reasoning? GFP_KERNEL = __GFP_FS | __GFP_IO |
> > > __GFP_RECLAIM
> > > Two of the flags mentioned there are already listed so we are talking
> > > about __GFP_RECLAIM here. Including it here would be a more appropriate
> > > change than GFP_KERNEL btw.
> > >
> > > But still I do not really understand what is the actual problem and how
> > > is this patch meant to fix it. vmpressure is triggered only from the
> > > reclaim path which inherently requires to have __GFP_RECLAIM present
> > > so I fail to see how this can make any change at all. How have you
> > > tested it?
> > >
> > We have a user space application which waits on memory pressure events.
> > Upon receiving the event, the user space program will free up huge
> > pages to make more memory available in the system.  This mechanism
> > works fine if the memory is being consumed by other user space
> > applications. To test this, we wrote a test program which will
> > allocate all the memory available in the system using malloc() and
> > touch the allocated pages. When the free memory level becomes low,
> > the pressure event is fired and the process gets notified about it .
> > The same test is repeated with kmalloc() instead of malloc(). A test
> > kernel module is developed, which will allocate all the available
> > memory with kmalloc(GFP_KERNEL) flag.  The OOM killer gets invoked in
> > this case. The memory pressure event is not fired.  After modifying
> > the vmpressure.c with the attached patch, the pressure event gets
> > triggered.  Swap is disabled in the system we were testing.
>
> Are you sure this is really the case? I am either missing something here
> or your test might simply be timing specific because
>
>         GFP_KERNEL & (__GFP_FS | __GFP_IO) = true
>
> so I really do not see how the current code could bail out on the test
> you are patching so that the patch would make any change. The only real
> difference this patch makes is to trigger events for __GFP_RECLAIM
> allocations which could be GFP_NOIO. All non-sleepable allocations would
> wake kswapd and that would in turn reclaim with __GFP_FS | __GFP_IO set
> so the check doesn't change anything.
>
> Am I missing something?
No, you are right. The pressure event does get generated by the kernel,
but the OOM killer is invoked before user space gets time to act.

Regards
Shaju



> --
> Michal Hocko
> SUSE Labs


