* [PATCH v2 0/2] mm, memory_hotplug: fix uninitialized pages fallouts.
@ 2019-01-30  9:12 Michal Hocko
  2019-01-30  9:12 ` [PATCH v2 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone Michal Hocko
  2019-01-30  9:12 ` [PATCH v2 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone Michal Hocko
  0 siblings, 2 replies; 3+ messages in thread
From: Michal Hocko @ 2019-01-30  9:12 UTC (permalink / raw)
  To: Mikhail Zaslonko, Mikhail Gavrilov
  Cc: Andrew Morton, Pavel Tatashin, schwidefsky, heiko.carstens,
	gerald.schaefer, linux-mm, LKML

Hi,
this is the second version of the series. v1 was posted at [1]. There
are no functional changes since v1. I have just fixed up the changelog
of patch 1, which had a wrong trace (copy-and-paste mistake). I have
also added Tested-by and Reviewed-by tags.

Mikhail posted fixes for the two bugs quite some time ago [2]. I
pushed back on those fixes because I believed it was much better to
plug the problem at initialization time rather than play whack-a-mole
all over the hotplug code to find every place which expects the full
memory section to be initialized. We ended up merging 2830bf6f05fb
("mm, memory_hotplug: initialize struct pages for the full memory
section"), which caused a regression [3][4]. The reason is that there
are memory layouts in which two NUMA nodes share the same memory
section, so the merged fix is simply incorrect.

In order to plug this hole we really have to be zone range aware in
those handlers. I have split up the original patch into two. One is
unchanged (patch 2) and I took a different approach for the `removable'
crash. It would be great if Mikhail could test that it still works for
his memory layout.

[1] http://lkml.kernel.org/r/20190128144506.15603-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
[4] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz




* [PATCH v2 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone
  2019-01-30  9:12 [PATCH v2 0/2] mm, memory_hotplug: fix uninitialized pages fallouts Michal Hocko
@ 2019-01-30  9:12 ` Michal Hocko
  2019-01-30  9:12 ` [PATCH v2 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone Michal Hocko
  1 sibling, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2019-01-30  9:12 UTC (permalink / raw)
  To: Mikhail Zaslonko, Mikhail Gavrilov
  Cc: Andrew Morton, Pavel Tatashin, schwidefsky, heiko.carstens,
	gerald.schaefer, linux-mm, LKML, Michal Hocko, Oscar Salvador

From: Michal Hocko <mhocko@suse.com>

Mikhail has reported the following VM_BUG_ON triggered when reading
sysfs removable state of a memory block:
 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<000000000038596c>] is_mem_section_removable+0xb4/0x190)
  [<00000000008f12fa>] show_mem_removable+0x9a/0xd8
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<000000000038596c>] is_mem_section_removable+0xb4/0x190
 Kernel panic - not syncing: Fatal exception: panic_on_oops

The reason is that the memory block spans the zone boundary and we are
stumbling over an uninitialized struct page. Fix this by enforcing the
zone range in is_mem_section_removable so that we never run past the
end of a zone.

Reported-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Debugged-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b9a667d36c55..07872789d778 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1233,7 +1233,8 @@ static bool is_pageblock_removable_nolock(struct page *page)
 bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
 {
 	struct page *page = pfn_to_page(start_pfn);
-	struct page *end_page = page + nr_pages;
+	unsigned long end_pfn = min(start_pfn + nr_pages, zone_end_pfn(page_zone(page)));
+	struct page *end_page = pfn_to_page(end_pfn);
 
 	/* Check the starting page of each pageblock within the range */
 	for (; page < end_page; page = next_active_pageblock(page)) {
-- 
2.20.1
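
As an illustration only (not part of the patch), the clamping done by
the hunk above can be modelled in userspace. This is a minimal sketch:
struct zone_model and clamp_end_pfn() are hypothetical names, the zone
is reduced to a start pfn and a page count, and the numbers in the note
below are made up for the example rather than taken from the report.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long zone_start_pfn;	/* first pfn of the zone */
	unsigned long spanned_pages;	/* number of pfns the zone spans */
};

/* Models the kernel helper of the same name: first pfn past the zone. */
static unsigned long zone_end_pfn(const struct zone_model *zone)
{
	return zone->zone_start_pfn + zone->spanned_pages;
}

/*
 * Clamp the end of [start_pfn, start_pfn + nr_pages) to the zone end,
 * so that a walker never looks at pfns beyond the zone.
 */
static unsigned long clamp_end_pfn(const struct zone_model *zone,
				   unsigned long start_pfn,
				   unsigned long nr_pages)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	unsigned long zone_end = zone_end_pfn(zone);

	/*
	 * Assumption of this sketch: the zone passed in is the one that
	 * contains start_pfn, as with page_zone(pfn_to_page(start_pfn)).
	 */
	assert(start_pfn >= zone->zone_start_pfn && start_pfn < zone_end);

	return end_pfn < zone_end ? end_pfn : zone_end;
}
```

With a zone spanning pfns [524288, 524800) and a 65536-page block that
starts at pfn 524288, clamp_end_pfn() returns 524800 instead of 589824,
so the loop stops before the struct pages that were never initialized.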



* [PATCH v2 2/2] mm, memory_hotplug: test_pages_in_a_zone do not pass the end of zone
  2019-01-30  9:12 [PATCH v2 0/2] mm, memory_hotplug: fix uninitialized pages fallouts Michal Hocko
  2019-01-30  9:12 ` [PATCH v2 1/2] mm, memory_hotplug: is_mem_section_removable do not pass the end of a zone Michal Hocko
@ 2019-01-30  9:12 ` Michal Hocko
  1 sibling, 0 replies; 3+ messages in thread
From: Michal Hocko @ 2019-01-30  9:12 UTC (permalink / raw)
  To: Mikhail Zaslonko, Mikhail Gavrilov
  Cc: Andrew Morton, Pavel Tatashin, schwidefsky, heiko.carstens,
	gerald.schaefer, linux-mm, LKML, Oscar Salvador, Michal Hocko

From: Mikhail Zaslonko <zaslonko@linux.ibm.com>

If memory end is not aligned with the sparse memory section boundary, the
mapping of such a section is only partly initialized. This may lead to
a VM_BUG_ON due to access to uninitialized struct pages from the
test_pages_in_a_zone() function, triggered by the memory_hotplug sysfs
handlers.

Here is a panic example:
 CONFIG_DEBUG_VM_PGFLAGS=y
 kernel parameter mem=2050M
 --------------------------
 page:000003d082008000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
 ([<0000000000385b26>] test_pages_in_a_zone+0xde/0x160)
  [<00000000008f15c4>] show_valid_zones+0x5c/0x190
  [<00000000008cf9c4>] dev_attr_show+0x34/0x70
  [<0000000000463ad0>] sysfs_kf_seq_show+0xc8/0x148
  [<00000000003e4194>] seq_read+0x204/0x480
  [<00000000003b53ea>] __vfs_read+0x32/0x178
  [<00000000003b55b2>] vfs_read+0x82/0x138
  [<00000000003b5be2>] ksys_read+0x5a/0xb0
  [<0000000000b86ba0>] system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
  [<0000000000385b26>] test_pages_in_a_zone+0xde/0x160
 Kernel panic - not syncing: Fatal exception: panic_on_oops

Fix this by checking whether the pfn to check is within the zone.

[mhocko@suse.com: separated this change from
http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com]
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 07872789d778..7711d0e327b6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1274,6 +1274,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
 				i++;
 			if (i == MAX_ORDER_NR_PAGES || pfn + i >= end_pfn)
 				continue;
+			/* Check if we got outside of the zone */
+			if (zone && !zone_spans_pfn(zone, pfn + i))
+				return 0;
 			page = pfn_to_page(pfn + i);
 			if (zone && page_zone(page) != zone)
 				return 0;
-- 
2.20.1
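
As an illustration only (not part of the patch), the "check the pfn
against the zone before touching its struct page" idea can be sketched
in userspace. This is a simplified model under stated assumptions:
struct zone_model and range_in_zone() are hypothetical, and the real
test_pages_in_a_zone() additionally skips invalid pfns and compares
page_zone() of each page, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct zone. */
struct zone_model {
	unsigned long zone_start_pfn;	/* first pfn of the zone */
	unsigned long spanned_pages;	/* number of pfns the zone spans */
};

/* Models the kernel helper of the same name: is pfn inside the zone span? */
static bool zone_spans_pfn(const struct zone_model *zone, unsigned long pfn)
{
	return zone->zone_start_pfn <= pfn &&
	       pfn < zone->zone_start_pfn + zone->spanned_pages;
}

/*
 * Walk [start_pfn, end_pfn) in chunks of 'step' pfns and return 0 as
 * soon as a chunk starts outside the zone, 1 otherwise.
 */
static int range_in_zone(const struct zone_model *zone,
			 unsigned long start_pfn, unsigned long end_pfn,
			 unsigned long step)
{
	unsigned long pfn;

	assert(step > 0);	/* a zero step would never terminate */

	for (pfn = start_pfn; pfn < end_pfn; pfn += step) {
		/* Bail out before any struct page would be looked at. */
		if (!zone_spans_pfn(zone, pfn))
			return 0;
		/* The kernel would call pfn_to_page(pfn) only at this point. */
	}
	return 1;
}
```

The ordering is the point of the sketch: the span check is pure
arithmetic on the zone boundaries, so it is safe even for pfns whose
struct page was never initialized.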


