From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: "Pavel Tatashin" <pasha.tatashin@oracle.com>,
	"Abdul Haleem" <abdhalee@linux.vnet.ibm.com>,
	"Baoquan He" <bhe@redhat.com>,
	"Daniel Jordan" <daniel.m.jordan@oracle.com>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Dave Hansen" <dave.hansen@intel.com>,
	"David Rientjes" <rientjes@google.com>,
	"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Ingo Molnar" <mingo@kernel.org>, "Jan Kara" <jack@suse.cz>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	"Michal Hocko" <mhocko@suse.com>,
	"Souptick Joarder" <jrdr.linux@gmail.com>,
	"Steven Sistare" <steven.sistare@oracle.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Wei Yang" <richard.weiyang@gmail.com>,
	"Pasha Tatashin" <Pavel.Tatashin@microsoft.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"Sasha Levin" <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.19 03/57] mm: calculate deferred pages after skipping mirrored memory
Date: Sun,  4 Nov 2018 08:50:50 -0500
Message-ID: <20181104135144.88324-3-sashal@kernel.org>
In-Reply-To: <20181104135144.88324-1-sashal@kernel.org>

From: Pavel Tatashin <pasha.tatashin@oracle.com>

[ Upstream commit d3035be4ce2345d98633a45f93a74e526e94b802 ]

update_defer_init() should be called only when a struct page is about to
be initialized, because it counts the number of initialized struct pages.
Currently it is called before the check for mirrored memory, so struct
pages that are then skipped as mirrored are counted as well.

So move update_defer_init() after the check for mirrored memory.

Also, rename update_defer_init() to defer_init() and reverse the return
boolean to emphasize that this is a boolean function that tells whether
the rest of memmap initialization should be deferred.

Make this function self-contained: use static counters instead of passing
in the number of pages already initialized in this zone.

I found this bug by reading the code.  The effect is that fewer than
expected struct pages are initialized early in boot, and it is possible
that in some corner cases we may fail to boot when mirrored pages are
used.  The deferred on demand code should somewhat mitigate this.  But
this still brings some inconsistencies compared to when booting without
mirrored pages, so it is better to fix.

[pasha.tatashin@oracle.com: add comment about defer_init's lack of locking]
  Link: http://lkml.kernel.org/r/20180726193509.3326-3-pasha.tatashin@oracle.com
[akpm@linux-foundation.org: make defer_init non-inline, __meminit]
Link: http://lkml.kernel.org/r/20180724235520.10200-3-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Pasha Tatashin <Pavel.Tatashin@microsoft.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/page_alloc.c | 45 +++++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 20 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2ef1c17942f..63f990b73750 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -306,24 +306,33 @@ static inline bool __meminit early_page_uninitialised(unsigned long pfn)
 }
 
 /*
- * Returns false when the remaining initialisation should be deferred until
+ * Returns true when the remaining initialisation should be deferred until
  * later in the boot cycle when it can be parallelised.
  */
-static inline bool update_defer_init(pg_data_t *pgdat,
-				unsigned long pfn, unsigned long zone_end,
-				unsigned long *nr_initialised)
+static bool __meminit
+defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 {
+	static unsigned long prev_end_pfn, nr_initialised;
+
+	/*
+	 * prev_end_pfn static that contains the end of previous zone
+	 * No need to protect because called very early in boot before smp_init.
+	 */
+	if (prev_end_pfn != end_pfn) {
+		prev_end_pfn = end_pfn;
+		nr_initialised = 0;
+	}
+
 	/* Always populate low zones for address-constrained allocations */
-	if (zone_end < pgdat_end_pfn(pgdat))
-		return true;
-	(*nr_initialised)++;
-	if ((*nr_initialised > pgdat->static_init_pgcnt) &&
-	    (pfn & (PAGES_PER_SECTION - 1)) == 0) {
-		pgdat->first_deferred_pfn = pfn;
+	if (end_pfn < pgdat_end_pfn(NODE_DATA(nid)))
 		return false;
+	nr_initialised++;
+	if ((nr_initialised > NODE_DATA(nid)->static_init_pgcnt) &&
+	    (pfn & (PAGES_PER_SECTION - 1)) == 0) {
+		NODE_DATA(nid)->first_deferred_pfn = pfn;
+		return true;
 	}
-
-	return true;
+	return false;
 }
 #else
 static inline bool early_page_uninitialised(unsigned long pfn)
@@ -331,11 +340,9 @@ static inline bool early_page_uninitialised(unsigned long pfn)
 	return false;
 }
 
-static inline bool update_defer_init(pg_data_t *pgdat,
-				unsigned long pfn, unsigned long zone_end,
-				unsigned long *nr_initialised)
+static inline bool defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 {
-	return true;
+	return false;
 }
 #endif
 
@@ -5459,9 +5466,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		struct vmem_altmap *altmap)
 {
 	unsigned long end_pfn = start_pfn + size;
-	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long pfn;
-	unsigned long nr_initialised = 0;
 	struct page *page;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 	struct memblock_region *r = NULL, *tmp;
@@ -5489,8 +5494,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			continue;
 		if (!early_pfn_in_nid(pfn, nid))
 			continue;
-		if (!update_defer_init(pgdat, pfn, end_pfn, &nr_initialised))
-			break;
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 		/*
@@ -5513,6 +5516,8 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 			}
 		}
 #endif
+		if (defer_init(nid, pfn, end_pfn))
+			break;
 
 not_early:
 		page = pfn_to_page(pfn);
-- 
2.17.1
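
For readers following along, here is a minimal userspace sketch of why the
call order matters. It is not kernel code: the sim_* names, the small
section size and threshold, and the single contiguous mirrored range are
all assumptions made for illustration, and the counter is an explicit
struct rather than the static variables used in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PAGES_PER_SECTION and static_init_pgcnt. */
#define SIM_PAGES_PER_SECTION 8UL
#define SIM_STATIC_INIT_PGCNT 16UL

struct sim_counter {
	unsigned long nr_initialised;
};

/*
 * Same shape as defer_init(): count one page, then return true on the
 * first section-aligned pfn once the early-init budget is exceeded.
 */
static bool sim_defer_init(struct sim_counter *c, unsigned long pfn)
{
	c->nr_initialised++;
	return c->nr_initialised > SIM_STATIC_INIT_PGCNT &&
	       (pfn & (SIM_PAGES_PER_SECTION - 1)) == 0;
}

/* Hypothetical mirrored range: pfns in [mstart, mend) are skipped. */
static bool sim_is_mirrored(unsigned long pfn, unsigned long mstart,
			    unsigned long mend)
{
	return pfn >= mstart && pfn < mend;
}

/*
 * Walk [start, end) and return how many pages were really initialised
 * before deferral. count_after_skip selects the ordering: false models
 * the old code (count, then skip), true models the patched code.
 */
static unsigned long sim_memmap_init(unsigned long start, unsigned long end,
				     unsigned long mstart, unsigned long mend,
				     bool count_after_skip)
{
	struct sim_counter c = { 0 };
	unsigned long pfn, done = 0;

	for (pfn = start; pfn < end; pfn++) {
		if (!count_after_skip && sim_defer_init(&c, pfn))
			break;
		if (sim_is_mirrored(pfn, mstart, mend))
			continue;
		if (count_after_skip && sim_defer_init(&c, pfn))
			break;
		done++;	/* stands in for __init_single_page() */
	}
	return done;
}
```

With a 64-page zone whose first 16 pfns are mirrored, the old ordering
spends the whole budget on skipped pages and initialises none early,
while the patched ordering initialises the expected 16.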

