linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, david@redhat.com,
	henry.willard@oracle.com, linux-mm@kvack.org,
	mgorman@techsingularity.net, mm-commits@vger.kernel.org,
	stable@vger.kernel.org, torvalds@linux-foundation.org,
	vbabka@suse.cz
Subject: [patch 15/15] mm: limit boost_watermark on small zones
Date: Thu, 07 May 2020 18:36:27 -0700	[thread overview]
Message-ID: <20200508013627.uDh87VOZJ%akpm@linux-foundation.org> (raw)
In-Reply-To: <20200507183509.c5ef146c5aaeb118a25a39a8@linux-foundation.org>

From: Henry Willard <henry.willard@oracle.com>
Subject: mm: limit boost_watermark on small zones

Commit 1c30844d2dfe ("mm: reclaim small amounts of memory when an external
fragmentation event occurs") adds a boost_watermark() function which
increases the min watermark in a zone by at least pageblock_nr_pages or
the number of pages in a page block. On Arm64, with 64K pages and 512M
huge pages, this is 8192 pages or 512M. It does this regardless of the
number of managed pages managed in the zone or the likelihood of success.
This can put the zone immediately under water in terms of allocating pages
from the zone, and can cause a small machine to fail immediately due to
OoM. Unlike set_recommended_min_free_kbytes(), which substantially
increases min_free_kbytes and is tied to THP, boost_watermark() can be
called even if THP is not active. The problem is most likely to appear
on architectures such as Arm64 where pageblock_nr_pages is very large.

It is desirable to run the kdump capture kernel in as small a space as
possible to avoid wasting memory. In some architectures, such as Arm64,
there are restrictions on where the capture kernel can run, and therefore,
the space available. A capture kernel running in 768M can fail due to OoM
immediately after boost_watermark() sets the min in zone DMA32, where
most of the memory is, to 512M. It fails even though there is over 500M of
free memory. With boost_watermark() suppressed, the capture kernel can run
successfully in 448M.

This patch limits boost_watermark() to boosting a zone's min watermark only
when there are enough pages that the boost will produce positive results.
In this case that is estimated to be four times as many pages as
pageblock_nr_pages.

Mel said:

: There is no harm in marking it stable.  Clearly it does not happen very
: often but it's not impossible.  32-bit x86 is a lot less common now
: which would previously have been vulnerable to triggering this easily. 
: ppc64 has a larger base page size but typically only has one zone. 
: arm64 is likely the most vulnerable, particularly when CMA is
: configured with a small movable zone.

Link: http://lkml.kernel.org/r/1588294148-6586-1-git-send-email-henry.willard@oracle.com
Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs")
Signed-off-by: Henry Willard <henry.willard@oracle.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/page_alloc.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- a/mm/page_alloc.c~mm-limit-boost_watermark-on-small-zones
+++ a/mm/page_alloc.c
@@ -2401,6 +2401,14 @@ static inline void boost_watermark(struc
 
 	if (!watermark_boost_factor)
 		return;
+	/*
+	 * Don't bother in zones that are unlikely to produce results.
+	 * On small machines, including kdump capture kernels running
+	 * in a small area, boosting the watermark can cause an out of
+	 * memory situation immediately.
+	 */
+	if ((pageblock_nr_pages * 4) > zone_managed_pages(zone))
+		return;
 
 	max_boost = mult_frac(zone->_watermark[WMARK_HIGH],
 			watermark_boost_factor, 10000);
_


  parent reply	other threads:[~2020-05-08  1:36 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-08  1:35 incoming Andrew Morton
2020-05-08  1:35 ` [patch 01/15] ipc/mqueue.c: change __do_notify() to bypass check_kill_permission() Andrew Morton
2020-05-08  1:35 ` [patch 02/15] mm, memcg: fix error return value of mem_cgroup_css_alloc() Andrew Morton
2020-05-08  1:35 ` [patch 03/15] mm/page_alloc: fix watchdog soft lockups during set_zone_contiguous() Andrew Morton
2020-05-08  1:35 ` [patch 04/15] kernel/kcov.c: fix typos in kcov_remote_start documentation Andrew Morton
2020-05-08  1:35 ` [patch 05/15] scripts/decodecode: fix trapping instruction formatting Andrew Morton
2020-05-08  1:35 ` [patch 06/15] arch/x86/kvm/svm/sev.c: change flag passed to GUP fast in sev_pin_memory() Andrew Morton
2020-05-08  1:35 ` [patch 07/15] eventpoll: fix missing wakeup for ovflist in ep_poll_callback Andrew Morton
2020-05-08  1:36 ` [patch 08/15] scripts/gdb: repair rb_first() and rb_last() Andrew Morton
2020-05-08  1:36 ` [patch 09/15] mm/slub: fix incorrect interpretation of s->offset Andrew Morton
2020-05-08  1:36 ` [patch 10/15] percpu: make pcpu_alloc() aware of current gfp context Andrew Morton
2020-05-08  1:36 ` [patch 11/15] kselftests: introduce new epoll60 testcase for catching lost wakeups Andrew Morton
2020-05-08  1:36 ` [patch 12/15] epoll: atomically remove wait entry on wake up Andrew Morton
2020-05-08  1:36 ` [patch 13/15] mm/vmscan: remove unnecessary argument description of isolate_lru_pages() Andrew Morton
2020-05-08  1:36 ` [patch 14/15] ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST Andrew Morton
2020-05-08  1:36 ` Andrew Morton [this message]
2020-05-11 22:44 ` mmotm 2020-05-11-15-43 uploaded Andrew Morton
2020-05-12  2:12   ` mmotm 2020-05-11-15-43 uploaded (ethernet/ti/ti_cpsw) Randy Dunlap
2020-05-13  9:20     ` Grygorii Strashko
2020-05-13 15:18       ` Randy Dunlap
2020-05-12  4:41   ` mmotm 2020-05-11-15-43 uploaded (mm/memcontrol.c, huge pages) Randy Dunlap
2020-05-12 12:17     ` Johannes Weiner
2020-05-12 15:27       ` Randy Dunlap
2020-05-12 15:37       ` Naresh Kamboju
2020-05-12 17:16       ` Geert Uytterhoeven
2020-05-12 15:11     ` Geert Uytterhoeven

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200508013627.uDh87VOZJ%akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=henry.willard@oracle.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mm-commits@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).