From: Michael Bringmann <mwb@linux.vnet.ibm.com>
To: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, mwb@linux.vnet.ibm.com
Cc: Michael Ellerman <mpe@ellerman.id.au>,
	Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	Kees Cook <keescook@chromium.org>,
	Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>,
	Russell Currey <ruscur@russell.cc>,
	Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
	Christophe Leroy <christophe.leroy@c-s.fr>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>,
	Pavel Tatashin <pasha.tatashin@oracle.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Oscar Salvador <osalvador@suse.de>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Mathieu Malaterre <malat@debian.org>,
	Juliet Kim <minkim@us.ibm.com>,
	Tyrel Datwyler <tyreld@linux.vnet.ibm.com>,
	Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Subject: [PATCH] migration/mm: Add WARN_ON to try_offline_node
Date: Mon, 01 Oct 2018 13:56:25 -0500
Message-ID: <20181001185616.11427.35521.stgit@ltcalpine2-lp9.aus.stglabs.ibm.com>

In some LPAR migration scenarios, device-tree modifications are
made to the affinity of the memory in the system.  For instance,
it may occur that memory is installed to nodes 0,3 on a source
system, and to nodes 0,2 on a target system.  Node 2 may not
have been initialized/allocated on the target system.

After migration, if an RTAS PRRN memory remove is made to a
memory block that was in node 3 on the source system, then
try_offline_node tries to remove it from node 2 on the target.
The NODE_DATA(2) block would not be initialized on the target,
and there is no validation check in the current code to prevent
the use of a NULL pointer.

A similar problem of moving memory to an uninitialized node has
also been observed on systems where multiple PRRN events occur
prior to a complete update of the device-tree.

Call traces such as the following may be observed:

pseries-hotplug-mem: Attempting to update LMB, drc index 80000002
Offlined Pages 4096
...
Oops: Kernel access of bad area, sig: 11 [#1]
...
Workqueue: pseries hotplug workque pseries_hp_work_fn
...
NIP [c0000000002bc088] try_offline_node+0x48/0x1e0
LR [c0000000002e0b84] remove_memory+0xb4/0xf0
Call Trace:
[c0000002bbee7a30] [c0000002bbee7a70] 0xc0000002bbee7a70 (unreliable)
[c0000002bbee7a70] [c0000000002e0b84] remove_memory+0xb4/0xf0
[c0000002bbee7ab0] [c000000000097784] dlpar_remove_lmb+0xb4/0x160
[c0000002bbee7af0] [c000000000097f38] dlpar_memory+0x328/0xcb0
[c0000002bbee7ba0] [c0000000000906d0] handle_dlpar_errorlog+0xc0/0x130
[c0000002bbee7c10] [c0000000000907d4] pseries_hp_work_fn+0x94/0xa0
[c0000002bbee7c40] [c0000000000e1cd0] process_one_work+0x1a0/0x4e0
[c0000002bbee7cd0] [c0000000000e21b0] worker_thread+0x1a0/0x610
[c0000002bbee7d80] [c0000000000ea458] kthread+0x128/0x150
[c0000002bbee7e30] [c00000000000982c] ret_from_kernel_thread+0x5c/0xb0

This patch adds a check for an incorrectly initialized node to
the beginning of try_offline_node: if NODE_DATA(nid) is NULL, it
issues a warning and exits the routine.
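
As an illustration only, the guard can be modeled in userspace as
below.  This is a minimal sketch, not the kernel code: the names
pgdat_sketch, node_data, MAX_NODES and try_offline_node_sketch are
hypothetical stand-ins for pg_data_t, NODE_DATA(), MAX_NUMNODES and
try_offline_node, the warning is a plain fprintf rather than
WARN_ON, and the sketch returns a status code where the real
routine returns void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 4	/* hypothetical; the kernel uses MAX_NUMNODES */

/* Simplified stand-in for the kernel's pg_data_t. */
struct pgdat_sketch {
	unsigned long node_start_pfn;
	unsigned long node_spanned_pages;
};

/* Stand-in for NODE_DATA(): entries for uninitialized nodes stay NULL. */
static struct pgdat_sketch *node_data[MAX_NODES];

/*
 * Returns 0 and stores the end PFN if the node is initialized,
 * or -1 (leaving *end_pfn untouched) if the node has no pgdat.
 */
static int try_offline_node_sketch(int nid, unsigned long *end_pfn)
{
	struct pgdat_sketch *pgdat;

	assert(nid >= 0 && nid < MAX_NODES);
	pgdat = node_data[nid];

	/* Check before any dereference, mirroring the WARN_ON() guard. */
	if (pgdat == NULL) {
		fprintf(stderr, "warning: node %d has no pgdat\n", nid);
		return -1;
	}

	/* Only now is it safe to read the node's PFN range. */
	*end_pfn = pgdat->node_start_pfn + pgdat->node_spanned_pages;
	return 0;
}
```

The point of the ordering is the same as in the patch: the fields
of the node structure are read only after the pointer has been
checked, so an uninitialized node results in a warning and an
early return instead of an oops.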

Another patch is being developed for powerpc to track the
node ID to which an LMB belongs, so that we can remove the
LMB from that node instead of from the nid as currently
interpreted from the device tree.

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
---
 mm/memory_hotplug.c |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 38d94b7..e48a4d0 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1831,10 +1831,16 @@ static int check_and_unmap_cpu_on_node(pg_data_t *pgdat)
 void try_offline_node(int nid)
 {
 	pg_data_t *pgdat = NODE_DATA(nid);
-	unsigned long start_pfn = pgdat->node_start_pfn;
-	unsigned long end_pfn = start_pfn + pgdat->node_spanned_pages;
+	unsigned long start_pfn;
+	unsigned long end_pfn;
 	unsigned long pfn;
 
+	if (WARN_ON(pgdat == NULL))
+		return;
+
+	start_pfn = pgdat->node_start_pfn;
+	end_pfn = start_pfn + pgdat->node_spanned_pages;
+
 	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
 		unsigned long section_nr = pfn_to_section_nr(pfn);
 


Thread overview:
2018-10-01 18:56 [PATCH] migration/mm: Add WARN_ON to try_offline_node Michael Bringmann [this message]
2018-10-01 20:02 ` Kees Cook
2018-10-01 20:27 ` Michal Hocko
2018-10-01 23:20   ` Tyrel Datwyler
2018-10-02 14:51     ` Michael Bringmann
2018-10-02 14:59       ` Michal Hocko
2018-10-02 15:14         ` Michael Bringmann
2018-10-02 16:04           ` Michal Hocko
2018-10-02 18:13             ` Michael Bringmann
2018-10-02 19:45               ` Tyrel Datwyler
2018-10-03  7:03                 ` Michal Hocko
2018-10-03 13:27                 ` Michael Bringmann
2018-10-03 23:05                   ` Tyrel Datwyler
2018-10-04  1:02                     ` Michael Bringmann
2018-10-01 23:23   ` Tyrel Datwyler
