From: Jia He <justin.he@arm.com>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@linux.ibm.com>, Baoquan He <bhe@redhat.com>,
	Chuhong Yuan <hslester96@gmail.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Kaly Xin <Kaly.Xin@arm.com>, Jia He <justin.he@arm.com>
Subject: [PATCH 2/3] mm/memory_hotplug: harden try_offline_node against bogus nid
Date: Mon,  6 Jul 2020 09:19:46 +0800
Message-ID: <20200706011947.184166-3-justin.he@arm.com>
In-Reply-To: <20200706011947.184166-1-justin.he@arm.com>

Testing the remove_memory path of dax pmem triggers a panic with the
following call trace:
  try_remove_memory+0x84/0x170
  remove_memory+0x38/0x58
  dev_dax_kmem_remove+0x3c/0x84 [kmem]
  device_release_driver_internal+0xfc/0x1c8
  device_release_driver+0x28/0x38
  bus_remove_device+0xd4/0x158
  device_del+0x160/0x3a0
  unregister_dev_dax+0x30/0x68
  devm_action_release+0x20/0x30
  release_nodes+0x150/0x240
  devres_release_all+0x6c/0x1d0
  device_release_driver_internal+0x10c/0x1c8
  driver_detach+0xac/0x170
  bus_remove_driver+0x64/0x130
  driver_unregister+0x34/0x60
  dax_pmem_exit+0x14/0xffc4 [dax_pmem]
  __arm64_sys_delete_module+0x18c/0x2d0
  el0_svc_common.constprop.2+0x78/0x168
  do_el0_svc+0x34/0xa0
  el0_sync_handler+0xe0/0x188
  el0_sync+0x164/0x180

The panic is caused by a bogus nid (-1). Although the root cause is that
pmem dax translates the pxm to a node id incorrectly because of numa_off,
it is still worth hardening try_offline_node() so that it returns early
if !pgdat.

Signed-off-by: Jia He <justin.he@arm.com>
---
 mm/memory_hotplug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index da374cd3d45b..e1e290577b45 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1680,6 +1680,9 @@ void try_offline_node(int nid)
 	pg_data_t *pgdat = NODE_DATA(nid);
 	int rc;
 
+	if (WARN_ON(!pgdat))
+		return;
+
 	/*
 	 * If the node still spans pages (especially ZONE_DEVICE), don't
 	 * offline it. A node spans memory after move_pfn_range_to_zone(),
-- 
2.17.1
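
For readers who want to see the shape of the guard outside the kernel, the
following is a minimal userspace sketch, not kernel code. Every name in it
(pgdat_model, node_table, node_data_model, try_offline_node_model, MAX_NODES)
is hypothetical and exists only for illustration. It also makes two
assumptions the patch does not: the lookup helper rejects out-of-range ids
before indexing the table, and the function returns a status code, whereas
the real try_offline_node() returns void and performs further checks that
are omitted here.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 4	/* hypothetical stand-in for the kernel's node limit */

/* Hypothetical, heavily reduced model of per-node data. */
struct pgdat_model {
	unsigned long spanned_pages;
	int online;
};

static struct pgdat_model node0 = { .spanned_pages = 0, .online = 1 };

/* Only node 0 is populated; the remaining slots are NULL. */
static struct pgdat_model *node_table[MAX_NODES] = { &node0 };

/*
 * Hypothetical lookup helper. Assumption: it bounds-checks nid before
 * indexing, so a bogus id such as -1 yields NULL instead of an
 * out-of-bounds read.
 */
static struct pgdat_model *node_data_model(int nid)
{
	if (nid < 0 || nid >= MAX_NODES)
		return NULL;
	return node_table[nid];
}

/*
 * Model of the hardened function: warn and bail out when there is no
 * node data, and refuse to offline a node that still spans pages.
 * Returns 0 if the node was marked offline, -1 otherwise.
 */
static int try_offline_node_model(int nid)
{
	struct pgdat_model *pgdat = node_data_model(nid);

	if (!pgdat) {
		fprintf(stderr, "warning: no node data for nid %d\n", nid);
		return -1;
	}

	if (pgdat->spanned_pages)
		return -1;

	pgdat->online = 0;
	return 0;
}
```

Calling try_offline_node_model(-1) in this model prints a warning and leaves
node 0 untouched, while try_offline_node_model(0) marks node 0 offline.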

