From: Justin He <Justin.He@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mike Rapoport <rppt@linux.ibm.com>, Baoquan He <bhe@redhat.com>,
	Chuhong Yuan <hslester96@gmail.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Kaly Xin <Kaly.Xin@arm.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <will@kernel.org>
Subject: RE: [PATCH 2/3] mm/memory_hotplug: harden try_offline_node against bogus nid
Date: Mon, 6 Jul 2020 13:45:13 +0000
Message-ID: <AM6PR08MB40697FCA7F2374EBE6459FE4F7690@AM6PR08MB4069.eurprd08.prod.outlook.com>
In-Reply-To: <4b864877-1147-8336-5e9a-e89ac5c99be3@redhat.com>

Hi David

> -----Original Message-----
> From: David Hildenbrand <david@redhat.com>
> Sent: Monday, July 6, 2020 3:58 PM
> To: Justin He <Justin.He@arm.com>; Catalin Marinas
> <Catalin.Marinas@arm.com>; Will Deacon <will@kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>; Mike Rapoport
> <rppt@linux.ibm.com>; Baoquan He <bhe@redhat.com>; Chuhong Yuan
> <hslester96@gmail.com>; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; linux-mm@kvack.org; Kaly Xin <Kaly.Xin@arm.com>
> Subject: Re: [PATCH 2/3] mm/memory_hotplug: harden try_offline_node
> against bogus nid
> 
> On 06.07.20 03:19, Jia He wrote:
> > When testing the remove_memory path of dax pmem, there will be a panic
> > with call trace:
> >   try_remove_memory+0x84/0x170
> >   remove_memory+0x38/0x58
> >   dev_dax_kmem_remove+0x3c/0x84 [kmem]
> >   device_release_driver_internal+0xfc/0x1c8
> >   device_release_driver+0x28/0x38
> >   bus_remove_device+0xd4/0x158
> >   device_del+0x160/0x3a0
> >   unregister_dev_dax+0x30/0x68
> >   devm_action_release+0x20/0x30
> >   release_nodes+0x150/0x240
> >   devres_release_all+0x6c/0x1d0
> >   device_release_driver_internal+0x10c/0x1c8
> >   driver_detach+0xac/0x170
> >   bus_remove_driver+0x64/0x130
> >   driver_unregister+0x34/0x60
> >   dax_pmem_exit+0x14/0xffc4 [dax_pmem]
> >   __arm64_sys_delete_module+0x18c/0x2d0
> >   el0_svc_common.constprop.2+0x78/0x168
> >   do_el0_svc+0x34/0xa0
> >   el0_sync_handler+0xe0/0x188
> >   el0_sync+0x164/0x180
> >
> > It is caused by the bogus nid (-1). Although the root cause is that pmem
> > dax translates from pxm to node_id incorrectly due to numa_off, it is
> > worth hardening the code in try_offline_node(), quitting if !pgdat.
> >
> > Signed-off-by: Jia He <justin.he@arm.com>
> > ---
> >  mm/memory_hotplug.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index da374cd3d45b..e1e290577b45 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1680,6 +1680,9 @@ void try_offline_node(int nid)
> >  	pg_data_t *pgdat = NODE_DATA(nid);
> >  	int rc;
> >
> > +	if (WARN_ON(!pgdat))
> > +		return;
> > +
> >  	/*
> >  	 * If the node still spans pages (especially ZONE_DEVICE), don't
> >  	 * offline it. A node spans memory after move_pfn_range_to_zone(),
> >
> 
> Hm. If I am not wrong, somebody used add_memory() with another nid than
> try_remove_memory()?
> 

Yes. Commit fa6d9ec790550 prevents this possibility,
so I will drop this patch. Thanks.
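
For reference, the guard proposed in the quoted diff can be sketched in user
space roughly as below. This is only an illustration, not kernel code:
node_table, lookup_node(), try_offline_node_sketch(), MAX_NODES and NO_NODE
are hypothetical names, and the bounds check in lookup_node() is an
assumption of the sketch (how the kernel's NODE_DATA(nid) behaves for
nid == -1 is architecture-specific and not modelled here).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_NODES 4      /* hypothetical limit for this sketch */
#define NO_NODE   (-1)   /* stands in for a bogus nid such as -1 */

/* Hypothetical stand-in for the per-node data (pg_data_t in the kernel). */
struct node_data {
	unsigned long spanned_pages;
	bool online;
};

/* Hypothetical per-node table; a NULL slot means no data for that node. */
static struct node_data *node_table[MAX_NODES];

/* Bounds-checked lookup (an assumption of this sketch). */
static struct node_data *lookup_node(int nid)
{
	if (nid < 0 || nid >= MAX_NODES)
		return NULL;
	return node_table[nid];
}

/* Returns true only if the node was actually marked offline. */
static bool try_offline_node_sketch(int nid)
{
	struct node_data *nd = lookup_node(nid);

	/* Mirrors the proposed "if (WARN_ON(!pgdat)) return;" guard. */
	if (!nd) {
		fprintf(stderr, "warning: no node data for nid %d\n", nid);
		return false;
	}

	/* A node that still spans pages must not be offlined. */
	if (nd->spanned_pages)
		return false;

	nd->online = false;
	return true;
}
```

Calling try_offline_node_sketch(NO_NODE) here only prints a warning and
returns false instead of dereferencing a missing node. As discussed above,
the better fix is to make sure the caller never passes such a nid.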
--
Cheers,
Justin (Jia He)