linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: Limit boost_watermark on small zones.
@ 2020-05-01  0:49 Henry Willard
  2020-05-01 22:57 ` Andrew Morton
  2020-05-05  8:30 ` David Hildenbrand
  0 siblings, 2 replies; 7+ messages in thread
From: Henry Willard @ 2020-05-01  0:49 UTC
  To: akpm; +Cc: linux-mm, linux-kernel

Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an external
fragmentation event occurs") adds a boost_watermark() function which
increases the min watermark in a zone by at least pageblock_nr_pages, the
number of pages in a page block. On Arm64, with 64K pages and 512M huge
pages, this is 8192 pages or 512M. It does this regardless of the number
of managed pages in the zone or the likelihood of success. This can put
the zone immediately under water in terms of allocating pages from the
zone, and can cause a small machine to fail immediately due to OOM.
Unlike set_recommended_min_free_kbytes(), which substantially increases
min_free_kbytes and is tied to THP, boost_watermark() can be called even
if THP is not active. The problem is most likely to appear on
architectures such as Arm64 where pageblock_nr_pages is very large.

It is desirable to run the kdump capture kernel in as small a space as
possible to avoid wasting memory. On some architectures, such as Arm64,
there are restrictions on where the capture kernel can run, and therefore
on the space available. A capture kernel running in 768M can fail due to
OOM immediately after boost_watermark() sets the min in zone DMA32, where
most of the memory is, to 512M. It fails even though there is over 500M of
free memory. With boost_watermark() suppressed, the capture kernel can run
successfully in 448M.

This patch limits boost_watermark() to boosting a zone's min watermark only
when the zone has enough managed pages for the boost to be worthwhile. That
threshold is estimated at four times pageblock_nr_pages; zones with fewer
managed pages are not boosted.

Signed-off-by: Henry Willard <henry.willard@oracle.com>
---
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 69827d4fa052..67805e794660 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2400,6 +2400,14 @@ static inline void boost_watermark(struct zone *zone)
 
 	if (!watermark_boost_factor)
 		return;
+	/*
+	 * Don't bother in zones that are unlikely to produce results.
+	 * On small machines, including kdump capture kernels running
+	 * in a small area, boosting the watermark can cause an out of
+	 * memory situation immediately.
+	 */
+	if ((pageblock_nr_pages * 4) > zone_managed_pages(zone))
+		return;
 
 	max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
 			watermark_boost_factor, 10000);
-- 
1.8.3.1
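
The effect of the added check can be sketched in user space. This is a
minimal model under stated assumptions, not kernel code: the helper names
are hypothetical, the whole 768M is treated as a single zone's managed
pages (an upper bound for the DMA32 zone described above), and the
"under water" test is a deliberately crude stand-in for the kernel's real
watermark check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a size in bytes to whole pages. Hypothetical helper, not a kernel API. */
static uint64_t pages_from_bytes(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return bytes / page_size;
}

/*
 * Mirrors the condition added by the patch: boosting is skipped when
 * four page blocks would exceed the zone's managed pages.
 */
static bool boost_permitted(uint64_t pageblock_nr_pages, uint64_t managed_pages)
{
	return !((pageblock_nr_pages * 4) > managed_pages);
}

/*
 * Crude stand-in for a watermark test (assumption): the zone counts as
 * under water when its free pages do not exceed min plus boost. The
 * kernel's real check considers more inputs, such as the allocation
 * order and lowmem reserves.
 */
static bool under_boosted_min(uint64_t free_pages, uint64_t min_pages,
			      uint64_t boost_pages)
{
	return free_pages <= min_pages + boost_pages;
}
```

With 64K pages, a 512M page block is 8192 pages and a 768M zone holds at
most 12288 pages. That is below 4 * 8192 = 32768, so boost_permitted()
returns false and the zone is left alone. Without the check, a zone with
about 500M free (8000 pages) would sit below a boost of 8192 pages, which
matches the failure described in the commit message.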



* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-01  0:49 [PATCH] mm: Limit boost_watermark on small zones Henry Willard
@ 2020-05-01 22:57 ` Andrew Morton
  2020-05-04 12:44   ` Mel Gorman
  2020-05-05  8:30 ` David Hildenbrand
  1 sibling, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2020-05-01 22:57 UTC
  To: Henry Willard; +Cc: linux-mm, linux-kernel, Mel Gorman

On Thu, 30 Apr 2020 17:49:08 -0700 Henry Willard <henry.willard@oracle.com> wrote:

Let's Cc Mel.

> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2400,6 +2400,14 @@ static inline void boost_watermark(struct zone *zone)
>  
>  	if (!watermark_boost_factor)
>  		return;
> +	/*
> +	 * Don't bother in zones that are unlikely to produce results.
> +	 * On small machines, including kdump capture kernels running
> +	 * in a small area, boosting the watermark can cause an out of
> +	 * memory situation immediately.
> +	 */
> +	if ((pageblock_nr_pages * 4) > zone_managed_pages(zone))
> +		return;
>  
>  	max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
>  			watermark_boost_factor, 10000);
> -- 
> 1.8.3.1


* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-01 22:57 ` Andrew Morton
@ 2020-05-04 12:44   ` Mel Gorman
  2020-05-04 20:36     ` Andrew Morton
  0 siblings, 1 reply; 7+ messages in thread
From: Mel Gorman @ 2020-05-04 12:44 UTC
  To: Andrew Morton; +Cc: Henry Willard, linux-mm, linux-kernel

On Fri, May 01, 2020 at 03:57:29PM -0700, Andrew Morton wrote:
> Let's Cc Mel.
> 

Seems reasonable.

Acked-by: Mel Gorman <mgorman@techsingularity.net>

-- 
Mel Gorman
SUSE Labs


* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-04 12:44   ` Mel Gorman
@ 2020-05-04 20:36     ` Andrew Morton
  2020-05-05  0:27       ` Henry Willard
  2020-05-05  7:58       ` Mel Gorman
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Morton @ 2020-05-04 20:36 UTC
  To: Mel Gorman; +Cc: Henry Willard, linux-mm, linux-kernel

On Mon, 4 May 2020 13:44:09 +0100 Mel Gorman <mgorman@techsingularity.net> wrote:

> ...
> Acked-by: Mel Gorman <mgorman@techsingularity.net>

Cool.  I wonder if we should backport this into -stable kernels?  "can
cause a small machine to fail immediately" sounds serious, but
1c30844d2dfe is from December 2018.  Any thoughts?


* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-04 20:36     ` Andrew Morton
@ 2020-05-05  0:27       ` Henry Willard
  2020-05-05  7:58       ` Mel Gorman
  1 sibling, 0 replies; 7+ messages in thread
From: Henry Willard @ 2020-05-05  0:27 UTC
  To: Andrew Morton, Mel Gorman; +Cc: linux-mm, linux-kernel

On 5/4/20 1:36 PM, Andrew Morton wrote:
> On Mon, 4 May 2020 13:44:09 +0100 Mel Gorman <mgorman@techsingularity.net> wrote:
>
>> ...
>> Acked-by: Mel Gorman <mgorman@techsingularity.net>
> Cool.  I wonder if we should backport this into -stable kernels?  "can
> cause a small machine to fail immediately" sounds serious, but
> 1c30844d2dfe is from December 2018.  Any thoughts?

It is a trivial patch, and we always like to have such patches in -stable
kernels when possible. It was a serious problem for us, because we had a
configuration where kdump on Arm always failed. However, other than
kdump, the problem is probably relatively rare.

Thanks,
Henry



* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-04 20:36     ` Andrew Morton
  2020-05-05  0:27       ` Henry Willard
@ 2020-05-05  7:58       ` Mel Gorman
  1 sibling, 0 replies; 7+ messages in thread
From: Mel Gorman @ 2020-05-05  7:58 UTC
  To: Andrew Morton; +Cc: Henry Willard, linux-mm, linux-kernel

On Mon, May 04, 2020 at 01:36:04PM -0700, Andrew Morton wrote:
> On Mon, 4 May 2020 13:44:09 +0100 Mel Gorman <mgorman@techsingularity.net> wrote:
> 
> > ...
> > Acked-by: Mel Gorman <mgorman@techsingularity.net>
> 
> Cool.  I wonder if we should backport this into -stable kernels?  "can
> cause a small machine to fail immediately" sounds serious, but
> 1c30844d2dfe is from December 2018.  Any thoughts?

There is no harm in marking it stable. Clearly it does not happen very
often but it's not impossible. 32-bit x86, which would previously have
been vulnerable to triggering this easily, is a lot less common now. ppc64
has a larger base page size but typically only has one zone. arm64 is
likely the most vulnerable, particularly when CMA is configured with a
small movable zone.

-- 
Mel Gorman
SUSE Labs
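
The architecture dependence described above follows from how the cutoff
scales with the page block size. A small sketch, assuming
pageblock_nr_pages equals 1 << pageblock_order: only the 64K page,
order-13 pairing (8192 pages per block) comes from this thread, the other
pairings mentioned in the note are assumed configurations chosen for
illustration, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pages per page block for a given order (1 << order). */
static uint64_t pageblock_pages(unsigned int pageblock_order)
{
	assert(pageblock_order < 32);
	return UINT64_C(1) << pageblock_order;
}

/*
 * Smallest zone size, in bytes, that is still boosted under the
 * patch's rule of four page blocks. Hypothetical helper.
 */
static uint64_t min_boostable_zone_bytes(uint64_t page_size,
					 unsigned int pageblock_order)
{
	return 4 * pageblock_pages(pageblock_order) * page_size;
}
```

Under these assumptions, 4K pages with order-9 page blocks give a cutoff
of 8M, 16K pages with order-11 blocks give 128M, and 64K pages with
order-13 blocks give 2G. The 768M capture kernel from the commit message
falls well under the 2G cutoff, so its zones are no longer boosted.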


* Re: [PATCH] mm: Limit boost_watermark on small zones.
  2020-05-01  0:49 [PATCH] mm: Limit boost_watermark on small zones Henry Willard
  2020-05-01 22:57 ` Andrew Morton
@ 2020-05-05  8:30 ` David Hildenbrand
  1 sibling, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2020-05-05  8:30 UTC
  To: Henry Willard, akpm; +Cc: linux-mm, linux-kernel

On 01.05.20 02:49, Henry Willard wrote:
> Signed-off-by: Henry Willard <henry.willard@oracle.com>

Reviewed-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


