From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Sasha Levin <sashal@kernel.org>,
	linux-mm@kvack.org
Subject: [PATCH AUTOSEL 4.19 34/36] mm, memory_hotplug: check zone_movable in has_unmovable_pages
Date: Thu, 22 Nov 2018 14:52:38 -0500
Message-ID: <20181122195240.13123-34-sashal@kernel.org>
In-Reply-To: <20181122195240.13123-1-sashal@kernel.org>

From: Michal Hocko <mhocko@suse.com>

[ Upstream commit 9d7899999c62c1a81129b76d2a6ecbc4655e1597 ]

Page state checks are racy.  Under a heavy memory workload (e.g.  stress
-m 200 -t 2h) it is quite easy to hit a race window when the page is
allocated but its state is not fully populated yet.  A debugging patch to
dump the struct page state shows

  has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
  page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
  flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)

Note that the state has been checked for both PageLRU and PageSwapBacked
already.  Closing this race completely would require some sort of retry
logic.  This can be tricky and error-prone (think of potentially endless
or long-running loops).

Work around this problem for movable zones at least.  Such a zone should
only contain movable pages.  Commit 15c30bc09085 ("mm, memory_hotplug:
make has_unmovable_pages more robust") has told us that this is not
strictly true, though.  Bootmem pages should be marked reserved, however,
so we can move the original check after the PageReserved check.  Pages
from other zones are still prone to races, but we do not even pretend
that memory hotremove works for those, so premature failure doesn't hurt
that much.

Link: http://lkml.kernel.org/r/20181106095524.14629-1-mhocko@kernel.org
Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Baoquan He <bhe@redhat.com>
Tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2ef1c17942f..3a4065312938 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7690,6 +7690,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if (PageReserved(page))
 			goto unmovable;
 
+		/*
+		 * If the zone is movable and we have ruled out all reserved
+		 * pages then it should be reasonably safe to assume the rest
+		 * is movable.
+		 */
+		if (zone_idx(zone) == ZONE_MOVABLE)
+			continue;
+
 		/*
 		 * Hugepages are not in LRU lists, but they're movable.
 		 * We need not scan over tail pages bacause we don't
-- 
2.17.1
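
The check ordering described in the changelog can be illustrated with a
small user-space model.  This is only a sketch under stated assumptions:
the model_* names, the struct layout and the single looks_movable flag
are hypothetical stand-ins, not kernel code, and the real
has_unmovable_pages() also handles hugepages, free pages and a
found/count threshold that are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the zone type; not the kernel's enum. */
enum model_zone { MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

/* Hypothetical stand-in for the page state the scan looks at. */
struct model_page {
	bool reserved;      /* models PageReserved(page) */
	bool looks_movable; /* models the later, racy state checks */
};

/*
 * Simplified model of the scan: returns true as soon as one page is
 * judged unmovable.  (Assumption: the real function tolerates a
 * configurable number of such pages; this model tolerates none.)
 */
static bool model_has_unmovable_pages(enum model_zone zone,
				      const struct model_page *pages,
				      size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++) {
		/* Reserved pages (e.g. bootmem) are unmovable in any zone. */
		if (pages[i].reserved)
			return true;

		/*
		 * In a movable zone, once reserved pages are ruled out,
		 * skip the racy checks and assume the page is movable.
		 */
		if (zone == MODEL_ZONE_MOVABLE)
			continue;

		/* Other zones still rely on the racy per-page checks. */
		if (!pages[i].looks_movable)
			return true;
	}
	return false;
}
```

In this model a page whose state is momentarily inconsistent
(looks_movable == false) no longer fails the scan in a movable zone,
while the same page in another zone still does, and a reserved page
fails the scan in either zone.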

