linux-kernel.vger.kernel.org archive mirror
* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
       [not found] <20211108205031.UxDPHBZWa%akpm@linux-foundation.org>
@ 2021-11-09  8:37 ` Michal Hocko
  2021-11-09  8:42   ` David Hildenbrand
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-11-09  8:37 UTC (permalink / raw)
  To: linux-kernel
  Cc: amakhalov, cl, david, dennis, mm-commits, osalvador, stable, tj

I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
There was no response to that feedback. I will not go as far as to nack
it explicitly because pcp allocator is not an area I would nack patches
but seriously, this issue needs a deeper look rather than a paper over
patch. I hope we do not want to do a similar thing to all callers of
cpu_to_mem.

On Mon 08-11-21 12:50:31, Andrew Morton wrote:
> 
> The patch titled
>      Subject: mm: fix panic in __alloc_pages
> has been added to the -mm tree.  Its filename is
>      mm-fix-panic-in-__alloc_pages.patch
> 
> This patch should soon appear at
>     https://ozlabs.org/~akpm/mmots/broken-out/mm-fix-panic-in-__alloc_pages.patch
> and later at
>     https://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-panic-in-__alloc_pages.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Alexey Makhalov <amakhalov@vmware.com>
> Subject: mm: fix panic in __alloc_pages
> 
> There is a kernel panic caused by pcpu_alloc_pages() passing an offlined and
> uninitialized node to alloc_pages_node(), which then dereferences the NULL
> NODE_DATA(nid) of that node.
> 
>  CPU2 has been hot-added
>  BUG: unable to handle page fault for address: 0000000000001608
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: 0000 [#1] SMP PTI
>  CPU: 0 PID: 1 Comm: systemd Tainted: G            E     5.15.0-rc7+ #11
>  Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW
> 
>  RIP: 0010:__alloc_pages+0x127/0x290
>  Code: 4c 89 f0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 e0 48 8b 55 b8 c1 e8 0c 83 e0 01 88 45 d0 4c 89 c8 48 85 d2 0f 85 1a 01 00 00 <45> 3b 41 08 0f 82 10 01 00 00 48 89 45 c0 48 8b 00 44 89 e2 81 e2
>  RSP: 0018:ffffc900006f3bc8 EFLAGS: 00010246
>  RAX: 0000000000001600 RBX: 0000000000000000 RCX: 0000000000000000
>  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000cc2
>  RBP: ffffc900006f3c18 R08: 0000000000000001 R09: 0000000000001600
>  R10: ffffc900006f3a40 R11: ffff88813c9fffe8 R12: 0000000000000cc2
>  R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000cc2
>  FS:  00007f27ead70500(0000) GS:ffff88807ce00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000001608 CR3: 000000000582c003 CR4: 00000000001706b0
>  Call Trace:
>   pcpu_alloc_pages.constprop.0+0xe4/0x1c0
>   pcpu_populate_chunk+0x33/0xb0
>   pcpu_alloc+0x4d3/0x6f0
>   __alloc_percpu_gfp+0xd/0x10
>   alloc_mem_cgroup_per_node_info+0x54/0xb0
>   mem_cgroup_alloc+0xed/0x2f0
>   mem_cgroup_css_alloc+0x33/0x2f0
>   css_create+0x3a/0x1f0
>   cgroup_apply_control_enable+0x12b/0x150
>   cgroup_mkdir+0xdd/0x110
>   kernfs_iop_mkdir+0x4f/0x80
>   vfs_mkdir+0x178/0x230
>   do_mkdirat+0xfd/0x120
>   __x64_sys_mkdir+0x47/0x70
>   ? syscall_exit_to_user_mode+0x21/0x50
>   do_syscall_64+0x43/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> The panic can be easily reproduced by disabling the udev rule for automatic
> onlining of hot-added CPUs and then hot adding a CPU with a memoryless node
> (a NUMA node with CPUs only).
> 
> Hot adding a CPU and a memoryless node does not bring the node to the online
> state.  A memoryless node is onlined only when its CPU is onlined.
> 
> Node can be in one of the following states:
> 1. not present.(nid == NUMA_NO_NODE)
> 2. present, but offline (nid > NUMA_NO_NODE, node_online(nid) == 0,
> 				NODE_DATA(nid) == NULL)
> 3. present and online (nid > NUMA_NO_NODE, node_online(nid) > 0,
> 				NODE_DATA(nid) != NULL)
> 
> Percpu code is doing allocations for all possible CPUs.  The issue happens
> when it serves a hot-added but not yet onlined CPU whose node is in the 2nd
> state.  Such a node is not ready for use, so fall back to numa_mem_id().
> 
> Link: https://lkml.kernel.org/r/20211108202325.20304-1-amakhalov@vmware.com
> Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
> 
>  mm/percpu-vm.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> --- a/mm/percpu-vm.c~mm-fix-panic-in-__alloc_pages
> +++ a/mm/percpu-vm.c
> @@ -84,15 +84,19 @@ static int pcpu_alloc_pages(struct pcpu_
>  			    gfp_t gfp)
>  {
>  	unsigned int cpu, tcpu;
> -	int i;
> +	int i, nid;
>  
>  	gfp |= __GFP_HIGHMEM;
>  
>  	for_each_possible_cpu(cpu) {
> +		nid = cpu_to_node(cpu);
> +		if (nid == NUMA_NO_NODE || !node_online(nid))
> +			nid = numa_mem_id();
> +
>  		for (i = page_start; i < page_end; i++) {
>  			struct page **pagep = &pages[pcpu_page_idx(cpu, i)];
>  
> -			*pagep = alloc_pages_node(cpu_to_node(cpu), gfp, 0);
> +			*pagep = alloc_pages_node(nid, gfp, 0);
>  			if (!*pagep)
>  				goto err;
>  		}
> _
> 
> Patches currently in -mm which might be from amakhalov@vmware.com are
> 
> mm-fix-panic-in-__alloc_pages.patch

-- 
Michal Hocko
SUSE Labs
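
A note on the oops quoted above: the faulting address is consistent with a
small offset from a NULL node pointer. CR2 is 0x1608, R09 is 0x1600, and the
marked instruction bytes (45 3b 41 08) decode to a 32-bit compare against
memory at 8(%r9). The sketch below redoes that arithmetic in user-space C.
The struct names, and the reading of 0x1600 as the offset of the zonelist
inside the per-node data, are assumptions made for illustration; the real
offset depends on kernel version and configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: only the offsets matter, not the field contents. */
struct zonelist_model {
	uint64_t first_word;	/* occupies bytes 0..7 */
	uint32_t word_at_8;	/* the 32-bit value the faulting compare reads */
};

struct pgdat_model {
	unsigned char earlier_fields[0x1600];	/* assumed size, taken from R09 */
	struct zonelist_model zonelist;
};

/* Address touched when the zonelist is reached through node_data_ptr. */
uint64_t fault_address(uint64_t node_data_ptr)
{
	/* The padding above must place the zonelist at the assumed offset. */
	assert(offsetof(struct pgdat_model, zonelist) == 0x1600);

	uint64_t zonelist = node_data_ptr + offsetof(struct pgdat_model, zonelist);
	return zonelist + offsetof(struct zonelist_model, word_at_8);
}
```

With a NULL node pointer, fault_address(0) evaluates to 0x1608, the value
reported in CR2.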


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-09  8:37 ` + mm-fix-panic-in-__alloc_pages.patch added to -mm tree Michal Hocko
@ 2021-11-09  8:42   ` David Hildenbrand
  2021-11-09 11:00     ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: David Hildenbrand @ 2021-11-09  8:42 UTC (permalink / raw)
  To: Michal Hocko, linux-kernel
  Cc: amakhalov, cl, dennis, mm-commits, osalvador, stable, tj

On 09.11.21 09:37, Michal Hocko wrote:
> I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
> There was no response to that feedback. I will not go as far as to nack
> it explicitly because pcp allocator is not an area I would nack patches
> but seriously, this issue needs a deeper look rather than a paper over
> patch. I hope we do not want to do a similar thing to all callers of
> cpu_to_mem.

While we could move it into the !HOLES version of cpu_to_mem(), calling
cpu_to_mem() on an offline (and eventually not even present) CPU (with
an offline node) is really a corner case.

Instead of additional runtime overhead for all cpu_to_mem(), my take
would be to just do it for the random special cases. Sure, we can
document that people should be careful when calling cpu_to_mem() on
offline CPUs. But IMHO it's really a corner case.

-- 
Thanks,

David / dhildenb
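
For reference, the per-call-site check under discussion (the one the quoted
patch adds to pcpu_alloc_pages()) can be modeled in user space roughly as
follows. This is only a sketch: struct node_model, classify_node() and
pick_alloc_node() are hypothetical names, the online flag stands in for
node_online(nid), and local_nid stands in for the value numa_mem_id() would
return.

```c
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE	(-1)
#define MAX_NODES	4	/* hypothetical topology size for this model */

/* Stand-in for per-node state; online models node_online(nid). */
struct node_model {
	bool online;
};

/* The three node states listed in the patch description. */
enum node_state {
	NODE_NOT_PRESENT,
	NODE_PRESENT_OFFLINE,
	NODE_PRESENT_ONLINE,
};

enum node_state classify_node(const struct node_model *nodes, int nid)
{
	if (nid == NUMA_NO_NODE)
		return NODE_NOT_PRESENT;
	assert(nid >= 0 && nid < MAX_NODES);
	return nodes[nid].online ? NODE_PRESENT_ONLINE : NODE_PRESENT_OFFLINE;
}

/* Use nid only when its node is online; otherwise fall back to local_nid. */
int pick_alloc_node(const struct node_model *nodes, int nid, int local_nid)
{
	if (classify_node(nodes, nid) != NODE_PRESENT_ONLINE)
		return local_nid;
	return nid;
}
```

The trade-off being argued about is where this branch lives: in each caller
that may see an offline node, or once in a common helper at the cost of a
check on every call.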



* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-09  8:42   ` David Hildenbrand
@ 2021-11-09 11:00     ` Michal Hocko
  2021-11-12 18:20       ` Dennis Zhou
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-11-09 11:00 UTC (permalink / raw)
  To: linux-kernel; +Cc: amakhalov, cl, dennis, mm-commits, osalvador, stable, tj

On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> On 09.11.21 09:37, Michal Hocko wrote:
> > I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
> > There was no response to that feedback. I will not go as far as to nack
> > it explicitly because pcp allocator is not an area I would nack patches
> > but seriously, this issue needs a deeper look rather than a paper over
> > patch. I hope we do not want to do a similar thing to all callers of
> > cpu_to_mem.
> 
> While we could move it into the !HOLES version of cpu_to_mem(), calling
> cpu_to_mem() on an offline (and eventually not even present) CPU (with
> an offline node) is really a corner case.
> 
> Instead of additional runtime overhead for all cpu_to_mem(), my take
> would be to just do it for the random special cases. Sure, we can
> document that people should be careful when calling cpu_to_mem() on
> offline CPUs. But IMHO it's really a corner case.

I suspect I haven't made myself clear enough. I do not think we should
be touching cpu_to_mem/cpu_to_node and handle this corner case. We
should be looking at the underlying problem instead. We cannot really
rely on cpu to be onlined to have a proper node association. We should
really look at the initialization code and handle this situation
properly. Memory less nodes are something we have been dealing with
already. This particular instance of the problem is new and we should
understand why.
-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-09 11:00     ` Michal Hocko
@ 2021-11-12 18:20       ` Dennis Zhou
  2021-11-15 10:41         ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Dennis Zhou @ 2021-11-12 18:20 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: linux-kernel, amakhalov, cl, mm-commits, osalvador, stable, tj

Hello,

On Tue, Nov 09, 2021 at 12:00:46PM +0100, Michal Hocko wrote:
> On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> > On 09.11.21 09:37, Michal Hocko wrote:
> > > I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
> > > There was no response to that feedback. I will not go as far as to nack
> > > it explicitly because pcp allocator is not an area I would nack patches
> > > but seriously, this issue needs a deeper look rather than a paper over
> > > patch. I hope we do not want to do a similar thing to all callers of
> > > cpu_to_mem.
> > 
> > While we could move it into the !HOLES version of cpu_to_mem(), calling
> > cpu_to_mem() on an offline (and eventually not even present) CPU (with
> > an offline node) is really a corner case.
> > 
> > Instead of additional runtime overhead for all cpu_to_mem(), my take
> > would be to just do it for the random special cases. Sure, we can
> > document that people should be careful when calling cpu_to_mem() on
> > offline CPUs. But IMHO it's really a corner case.
> 
> I suspect I haven't made myself clear enough. I do not think we should
> be touching cpu_to_mem/cpu_to_node and handle this corner case. We
> should be looking at the underlying problem instead. We cannot really
> rely on cpu to be onlined to have a proper node association. We should
> really look at the initialization code and handle this situation
> properly. Memory less nodes are something we have been dealing with
> already. This particular instance of the problem is new and we should
> understand why.
> -- 
> Michal Hocko
> SUSE Labs

So I think we're still short a solution here. This patch solves the side
effect but not the underlying problem related to cpu hotplug.

I'm fine with this going in as a stop gap because I imagine the fixes to
hotplug are a lot more intrusive, but do we have someone who can own
that work to fix hotplug? I think that should be a requirement for
taking this because clearly it's hotplug that's broken and not percpu.

Acked-by: Dennis Zhou <dennis@kernel.org>

Thanks,
Dennis


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-12 18:20       ` Dennis Zhou
@ 2021-11-15 10:41         ` Michal Hocko
  2021-11-15 11:04           ` Alexey Makhalov
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-11-15 10:41 UTC (permalink / raw)
  To: Dennis Zhou
  Cc: Andrew Morton, linux-kernel, amakhalov, cl, mm-commits,
	osalvador, stable, tj

On Fri 12-11-21 13:20:20, Dennis Zhou wrote:
> Hello,
> 
> On Tue, Nov 09, 2021 at 12:00:46PM +0100, Michal Hocko wrote:
> > On Tue 09-11-21 09:42:56, David Hildenbrand wrote:
> > > On 09.11.21 09:37, Michal Hocko wrote:
> > > > I have opposed this patch http://lkml.kernel.org/r/YYj91Mkt4m8ySIWt@dhcp22.suse.cz
> > > > There was no response to that feedback. I will not go as far as to nack
> > > > it explicitly because pcp allocator is not an area I would nack patches
> > > > but seriously, this issue needs a deeper look rather than a paper over
> > > > patch. I hope we do not want to do a similar thing to all callers of
> > > > cpu_to_mem.
> > > 
> > > While we could move it into the !HOLES version of cpu_to_mem(), calling
> > > cpu_to_mem() on an offline (and eventually not even present) CPU (with
> > > an offline node) is really a corner case.
> > > 
> > > Instead of additional runtime overhead for all cpu_to_mem(), my take
> > > would be to just do it for the random special cases. Sure, we can
> > > document that people should be careful when calling cpu_to_mem() on
> > > offline CPUs. But IMHO it's really a corner case.
> > 
> > I suspect I haven't made myself clear enough. I do not think we should
> > be touching cpu_to_mem/cpu_to_node and handle this corner case. We
> > should be looking at the underlying problem instead. We cannot really
> > rely on cpu to be onlined to have a proper node association. We should
> > really look at the initialization code and handle this situation
> > properly. Memory less nodes are something we have been dealing with
> > already. This particular instance of the problem is new and we should
> > understand why.
> > -- 
> > Michal Hocko
> > SUSE Labs
> 
> So I think we're still short a solution here. This patch solves the side
> effect but not the underlying problem related to cpu hotplug.
> 
> I'm fine with this going in as a stop gap because I imagine the fixes to
> hotplug are a lot more intrusive, but do we have someone who can own
> that work to fix hotplug? I think that should be a requirement for
> taking this because clearly it's hotplug that's broken and not percpu.

I have asked several times for details about the specific setup that has
led to the reported crash. Without much success so far. Reproduction
steps would be the first step. That would allow somebody to work on this
at least if Alexey doesn't have time to dive into this deeper.

I would be more inclined to a stop gap workaround if this was a more
widespread problem, but the lack of other reports suggests this has been
a one off.

The final saying is yours of course.
 
> Acked-by: Dennis Zhou <dennis@kernel.org>
> 
> Thanks,
> Dennis

-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-15 10:41         ` Michal Hocko
@ 2021-11-15 11:04           ` Alexey Makhalov
  2021-11-15 12:58             ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Alexey Makhalov @ 2021-11-15 11:04 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Dennis Zhou, Andrew Morton, linux-kernel, cl, mm-commits,
	osalvador, stable, tj

Hi Michal,

> 
> I have asked several times for details about the specific setup that has
> led to the reported crash. Without much success so far. Reproduction
> steps would be the first step. That would allow somebody to work on this
> at least if Alexey doesn't have time to dive into this deeper.
> 

I didn’t know that repro steps are still not clear.

To reproduce the panic you need to have a system, where you can hot add
the CPU that belongs to memoryless NUMA node which is not present and onlined
yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
at the same time.
I’m using VMware hypervisor and linux VM there configured in a way
that every (possible) CPU has its own NUMA node.
Before doing CPU hot add, udev rule for CPU onlining should be disabled.
After CPU hot add event, panic will be triggered shortly right on the next
percpu allocation.

Let me know if this is enough or you need some extra information.

Thanks,
—Alexey
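
One way to confirm the precondition from the steps above (the CPU has been
hot added but is still offline) is to compare the CPU lists in
/sys/devices/system/cpu/present and /sys/devices/system/cpu/online. The
sketch below parses that list format ("0-1,3"). The helper names are
hypothetical, and reading the two sysfs files into strings is left to the
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Return true if cpu appears in a sysfs-style CPU list such as "0-1,3". */
bool cpulist_contains(const char *list, int cpu)
{
	assert(list != NULL);

	const char *p = list;
	while (*p != '\0' && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return false;	/* no digits where a number was expected */
		if (*end == '-') {
			const char *range_end = end + 1;

			hi = strtol(range_end, &end, 10);
			if (end == range_end)
				return false;	/* dangling '-' */
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		/* Step over the separator; any other character ends the scan. */
		p = (*end == ',') ? end + 1 : end;
	}
	return false;
}

/* True when cpu is listed as present but not as online. */
bool cpu_added_but_offline(const char *present, const char *online, int cpu)
{
	return cpulist_contains(present, cpu) && !cpulist_contains(online, cpu);
}
```

For the report in this thread, CPU2 would be expected in the present list
and absent from the online list at the time of the panic.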


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-15 11:04           ` Alexey Makhalov
@ 2021-11-15 12:58             ` Michal Hocko
  2021-11-15 23:11               ` Alexey Makhalov
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-11-15 12:58 UTC (permalink / raw)
  To: Alexey Makhalov
  Cc: Dennis Zhou, Andrew Morton, linux-kernel, cl, mm-commits,
	osalvador, stable, tj

On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> Hi Michal,
> 
> > 
> > I have asked several times for details about the specific setup that has
> > led to the reported crash. Without much success so far. Reproduction
> > steps would be the first step. That would allow somebody to work on this
> > at least if Alexey doesn't have time to dive into this deeper.
> > 
> 
> I didn’t know that repro steps are still not clear.
> 
> To reproduce the panic you need to have a system, where you can hot add
> the CPU that belongs to memoryless NUMA node which is not present and onlined
> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> at the same time.

There seems to be something different in your setup because memory less
nodes have reportedly worked on x86. I suspect something must be
different in your setup. Maybe it is that you are adding a cpu that is
outside of possible cpus initialized during boot time. Those should have
their nodes initialized properly - at least per init_cpu_to_node. Your
report doesn't really explain how the cpu is hotadded. Maybe you are
trying to do something that has never been supported on x86.

It would be really great if you can provide more information in the
original email thread. E.g. boot time messages and then more details
about the hotplug operation as well (e.g. which cpu, the node
association, how it is injected to the guest etc.).

Thanks!
-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-15 12:58             ` Michal Hocko
@ 2021-11-15 23:11               ` Alexey Makhalov
  2021-11-16  3:52                 ` Dennis Zhou
  0 siblings, 1 reply; 15+ messages in thread
From: Alexey Makhalov @ 2021-11-15 23:11 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Dennis Zhou, Andrew Morton, linux-kernel, cl, mm-commits,
	osalvador, stable, tj

> On Nov 15, 2021, at 4:58 AM, Michal Hocko <mhocko@suse.com> wrote:
> 
> On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
>> Hi Michal,
>> 
>>> 
>>> I have asked several times for details about the specific setup that has
>>> led to the reported crash. Without much success so far. Reproduction
>>> steps would be the first step. That would allow somebody to work on this
>>> at least if Alexey doesn't have time to dive into this deeper.
>>> 
>> 
>> I didn’t know that repro steps are still not clear.
>> 
>> To reproduce the panic you need to have a system, where you can hot add
>> the CPU that belongs to memoryless NUMA node which is not present and onlined
>> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
>> at the same time.
> 
> There seems to be something different in your setup because memory less
> nodes have reportedly worked on x86. I suspect something must be
> different in your setup. Maybe it is that you are adding a cpu that is
> outside of possible cpus initialized during boot time. Those should have
> their nodes initialized properly - at least per init_cpu_to_node. Your
> report doesn't really explain how the cpu is hotadded. Maybe you are
> trying to do something that has never been supported on x86.
Memoryless nodes are supported on x86, but hot add of such nodes is not
quite done.

> 
> It would be really great if you can provide more information in the
> original email thread. E.g. boot time messages and then more details
> about the hotplug operation as well (e.g. which cpu, the node
> association, how it is injected to the guest etc.).
> 
I’ll provide more information in the main thread.



Regards,
—Alexey


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-15 23:11               ` Alexey Makhalov
@ 2021-11-16  3:52                 ` Dennis Zhou
  2021-11-16 12:30                   ` Christoph Lameter
  2021-12-14 10:11                   ` Michal Hocko
  0 siblings, 2 replies; 15+ messages in thread
From: Dennis Zhou @ 2021-11-16  3:52 UTC (permalink / raw)
  To: Alexey Makhalov
  Cc: Michal Hocko, Dennis Zhou, Andrew Morton, linux-kernel, cl,
	mm-commits, osalvador, stable, tj

On Mon, Nov 15, 2021 at 11:11:44PM +0000, Alexey Makhalov wrote:
> 
> 
> > On Nov 15, 2021, at 4:58 AM, Michal Hocko <mhocko@suse.com> wrote:
> > 
> > On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> >> Hi Michal,
> >> 
> >>> 
> >>> I have asked several times for details about the specific setup that has
> >>> led to the reported crash. Without much success so far. Reproduction
> >>> steps would be the first step. That would allow somebody to work on this
> >>> at least if Alexey doesn't have time to dive into this deeper.
> >>> 
> >> 
> >> I didn’t know that repro steps are still not clear.
> >> 
> >> To reproduce the panic you need to have a system, where you can hot add
> >> the CPU that belongs to memoryless NUMA node which is not present and onlined
> >> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> >> at the same time.
> > 
> > There seems to be something different in your setup because memory less
> > nodes have reportedly worked on x86. I suspect something must be
> > different in your setup. Maybe it is that you are adding a cpu that is
> > outside of possible cpus initialized during boot time. Those should have
> > their nodes initialized properly - at least per init_cpu_to_node. Your
> > report doesn't really explain how the cpu is hotadded. Maybe you are
> > trying to do something that has never been supported on x86.
> Memoryless nodes are supported on x86, but hot add of such nodes is not
> quite done.
> 

I need some clarification here. It sounds like memoryless nodes work on
x86, but hotplug + memoryless nodes isn't a supported use case or you're
introducing it as a new use case?

If this is a new use case, then I'm inclined to say this patch should
NOT go in and a proper fix should be implemented on hotplug's side. I
don't want to be in the business of having/seeing this conversation
reoccur because we just papered over this issue in percpu.

Thanks,
Dennis


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-16  3:52                 ` Dennis Zhou
@ 2021-11-16 12:30                   ` Christoph Lameter
  2021-11-16 15:41                     ` Michal Hocko
  2021-12-14 10:11                   ` Michal Hocko
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Lameter @ 2021-11-16 12:30 UTC (permalink / raw)
  To: Dennis Zhou
  Cc: Alexey Makhalov, Michal Hocko, Andrew Morton, linux-kernel,
	mm-commits, osalvador, stable, tj

On Mon, 15 Nov 2021, Dennis Zhou wrote:

> I need some clarification here. It sounds like memoryless nodes work on
> x86, but hotplug + memoryless nodes isn't a supported use case or you're
> introducing it as a new use case?

Could you do that step by step?

First add the new node and ensure everything is ok and that the memory is
online.

*After* that is done bring up the new processor and associate the
processor with *online* memory.




* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-16 12:30                   ` Christoph Lameter
@ 2021-11-16 15:41                     ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2021-11-16 15:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Dennis Zhou, Alexey Makhalov, Andrew Morton, linux-kernel,
	mm-commits, osalvador, stable, tj

On Tue 16-11-21 13:30:45, Christoph Lameter wrote:
> On Mon, 15 Nov 2021, Dennis Zhou wrote:
> 
> > I need some clarification here. It sounds like memoryless nodes work on
> > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > introducing it as a new use case?
> 
> Could you do that step by step?
> 
> First add the new node and ensure everything is ok and that the memory is
> online.
> 
> *After* that is done bring up the new processor and associate the
> processor with *online* memory.

We are discussing that in the original thread -
http://lkml.kernel.org/r/YZN3ExwL7BiDS5nj@dhcp22.suse.cz

This patch is a workaround for that problem in the pcp code.
 

-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-11-16  3:52                 ` Dennis Zhou
  2021-11-16 12:30                   ` Christoph Lameter
@ 2021-12-14 10:11                   ` Michal Hocko
  2021-12-14 20:57                     ` Andrew Morton
  1 sibling, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-12-14 10:11 UTC (permalink / raw)
  To: Dennis Zhou
  Cc: Alexey Makhalov, Andrew Morton, linux-kernel, cl, mm-commits,
	osalvador, stable, tj

On Mon 15-11-21 22:52:27, Dennis Zhou wrote:
> On Mon, Nov 15, 2021 at 11:11:44PM +0000, Alexey Makhalov wrote:
> > 
> > 
> > > On Nov 15, 2021, at 4:58 AM, Michal Hocko <mhocko@suse.com> wrote:
> > > 
> > > On Mon 15-11-21 11:04:16, Alexey Makhalov wrote:
> > >> Hi Michal,
> > >> 
> > >>> 
> > >>> I have asked several times for details about the specific setup that has
> > >>> led to the reported crash. Without much success so far. Reproduction
> > >>> steps would be the first step. That would allow somebody to work on this
> > >>> at least if Alexey doesn't have time to dive into this deeper.
> > >>> 
> > >> 
> > >> I didn’t know that repro steps are still not clear.
> > >> 
> > >> To reproduce the panic you need to have a system, where you can hot add
> > >> the CPU that belongs to memoryless NUMA node which is not present and onlined
> > >> yet. In other words, by hot adding CPU, you will add both CPU and NUMA node
> > >> at the same time.
> > > 
> > > There seems to be something different in your setup because memory less
> > > nodes have reportedly worked on x86. I suspect something must be
> > > different in your setup. Maybe it is that you are adding a cpu that is
> > > outside of possible cpus initialized during boot time. Those should have
> > > their nodes initialized properly - at least per init_cpu_to_node. Your
> > > report doesn't really explain how the cpu is hotadded. Maybe you are
> > > trying to do something that has never been supported on x86.
> > Memoryless nodes are supported on x86, but hot add of such nodes is not
> > quite done.
> > 
> 
> I need some clarification here. It sounds like memoryless nodes work on
> x86, but hotplug + memoryless nodes isn't a supported use case or you're
> introducing it as a new use case?
> 
> If this is a new use case, then I'm inclined to say this patch should
> NOT go in and a proper fix should be implemented on hotplug's side. I
> don't want to be in the business of having/seeing this conversation
> reoccur because we just papered over this issue in percpu.

The patch still seems to be in the mmotm tree. I have sent a different
fix candidate [1] which should be more robust and cover also other potential
places.

[1] http://lkml.kernel.org/r/20211214100732.26335-1-mhocko@kernel.org
-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-12-14 10:11                   ` Michal Hocko
@ 2021-12-14 20:57                     ` Andrew Morton
  2021-12-15 10:05                       ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2021-12-14 20:57 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Dennis Zhou, Alexey Makhalov, linux-kernel, cl, mm-commits,
	osalvador, stable, tj

On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <mhocko@suse.com> wrote:

> > I need some clarification here. It sounds like memoryless nodes work on
> > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > introducing it as a new use case?
> > 
> > If this is a new use case, then I'm inclined to say this patch should
> > NOT go in and a proper fix should be implemented on hotplug's side. I
> > don't want to be in the business of having/seeing this conversation
> > reoccur because we just papered over this issue in percpu.
> 
> The patch still seems to be in the mmotm tree. I have sent a different
> fix candidate [1] which should be more robust and cover also other potential
> places.
> 
> [1] http://lkml.kernel.org/r/20211214100732.26335-1-mhocko@kernel.org

Is cool, I'm paying attention.

We do want something short and simple for backporting to -stable (like
Alexey's patch) so please bear that in mind while preparing an
alternative.


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-12-14 20:57                     ` Andrew Morton
@ 2021-12-15 10:05                       ` Michal Hocko
  2021-12-15 12:20                         ` Michal Hocko
  0 siblings, 1 reply; 15+ messages in thread
From: Michal Hocko @ 2021-12-15 10:05 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dennis Zhou, Alexey Makhalov, cl, mm-commits, osalvador, stable, tj

On Tue 14-12-21 12:57:48, Andrew Morton wrote:
> On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <mhocko@suse.com> wrote:
> 
> > > I need some clarification here. It sounds like memoryless nodes work on
> > > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > > introducing it as a new use case?
> > > 
> > > If this is a new use case, then I'm inclined to say this patch should
> > > NOT go in and a proper fix should be implemented on hotplug's side. I
> > > don't want to be in the business of having/seeing this conversation
> > > reoccur because we just papered over this issue in percpu.
> > 
> > The patch still seems to be in the mmotm tree. I have sent a different
> > fix candidate [1] which should be more robust and cover also other potential
> > places.
> > 
> > [1] http://lkml.kernel.org/r/20211214100732.26335-1-mhocko@kernel.org
> 
> Is cool, I'm paying attention.
> 
> We do want something short and simple for backporting to -stable (like
> Alexey's patch) so please bear that in mind while preparing an
> alternative.

I think we want something that fixes the underlying problem. Please keep
in mind that the pcp allocation is not the only place to hit the issue.
We have more. I do not think we want to handle each and every one
separately.

I am definitely not going to push for my solution but if there is a
consensus this is the right approach then I do not think we really want
to implement these partial workarounds.

-- 
Michal Hocko
SUSE Labs


* Re: + mm-fix-panic-in-__alloc_pages.patch added to -mm tree
  2021-12-15 10:05                       ` Michal Hocko
@ 2021-12-15 12:20                         ` Michal Hocko
  0 siblings, 0 replies; 15+ messages in thread
From: Michal Hocko @ 2021-12-15 12:20 UTC (permalink / raw)
  To: linux-kernel
  Cc: Dennis Zhou, Alexey Makhalov, cl, mm-commits, osalvador, stable, tj

On Wed 15-12-21 11:05:12, Michal Hocko wrote:
> On Tue 14-12-21 12:57:48, Andrew Morton wrote:
> > On Tue, 14 Dec 2021 11:11:54 +0100 Michal Hocko <mhocko@suse.com> wrote:
> > 
> > > > I need some clarification here. It sounds like memoryless nodes work on
> > > > x86, but hotplug + memoryless nodes isn't a supported use case or you're
> > > > introducing it as a new use case?
> > > > 
> > > > If this is a new use case, then I'm inclined to say this patch should
> > > > NOT go in and a proper fix should be implemented on hotplug's side. I
> > > > don't want to be in the business of having/seeing this conversation
> > > > reoccur because we just papered over this issue in percpu.
> > > 
> > > The patch still seems to be in the mmotm tree. I have sent a different
> > > fix candidate [1] which should be more robust and cover also other potential
> > > places.
> > > 
> > > [1] http://lkml.kernel.org/r/20211214100732.26335-1-mhocko@kernel.org
> > 
> > Is cool, I'm paying attention.
> > 
> > We do want something short and simple for backporting to -stable (like
> > Alexey's patch) so please bear that in mind while preparing an
> > alternative.
> 
> I think we want something that fixes the underlying problem. Please keep
> in mind that the pcp allocation is not the only place to hit the issue.
> We have more. I do not think we want to handle each and every one
> separately.
> 
> I am definitely not going to push for my solution but if there is a
> consensus this is the right approach then I do not think we really want
> to implement these partial workarounds.

Btw. I forgot to add that if we do not agree on the preallocation
approach then the approach should be something like 
http://lkml.kernel.org/r/51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com
proposed by David.
-- 
Michal Hocko
SUSE Labs
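
To make the contrast with the per-call-site workaround concrete, here is a
rough user-space model of what a preallocation approach could look like.
This thread does not spell out the details of that fix candidate, so the
sketch is one possible reading and not the actual patch: every possible
node gets valid per-node data at init time, so a later allocation that
names a node without usable memory finds a fallback instead of a NULL
pointer. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES	4	/* hypothetical topology size for this model */

/* Stand-in for per-node allocator data; all names here are illustrative. */
struct node_data_model {
	bool has_memory;	/* false for memoryless or not yet onlined nodes */
	int fallback_nid;	/* node to allocate from when this one has no memory */
};

static struct node_data_model node_storage[MAX_NODES];
static struct node_data_model *node_data[MAX_NODES];	/* NULL = never initialized */

/* Boot-time step: give every possible node valid data, online or not. */
void preallocate_possible_nodes(const bool possible[MAX_NODES],
				const bool has_memory[MAX_NODES],
				int boot_nid)
{
	assert(boot_nid >= 0 && boot_nid < MAX_NODES);

	for (int nid = 0; nid < MAX_NODES; nid++) {
		if (!possible[nid])
			continue;
		node_storage[nid].has_memory = has_memory[nid];
		node_storage[nid].fallback_nid = boot_nid;
		node_data[nid] = &node_storage[nid];
	}
}

/* Allocation-time step: node to take a page from, or -1 if never initialized. */
int resolve_alloc_node(int nid)
{
	if (nid < 0 || nid >= MAX_NODES || node_data[nid] == NULL)
		return -1;	/* models the NULL NODE_DATA case from the oops */
	return node_data[nid]->has_memory ? nid : node_data[nid]->fallback_nid;
}
```

In this model no caller needs its own node_online() check, which is the
property argued for above: the fallback is decided once, where the node
data is set up.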


Thread overview: 15+ messages
     [not found] <20211108205031.UxDPHBZWa%akpm@linux-foundation.org>
2021-11-09  8:37 ` + mm-fix-panic-in-__alloc_pages.patch added to -mm tree Michal Hocko
2021-11-09  8:42   ` David Hildenbrand
2021-11-09 11:00     ` Michal Hocko
2021-11-12 18:20       ` Dennis Zhou
2021-11-15 10:41         ` Michal Hocko
2021-11-15 11:04           ` Alexey Makhalov
2021-11-15 12:58             ` Michal Hocko
2021-11-15 23:11               ` Alexey Makhalov
2021-11-16  3:52                 ` Dennis Zhou
2021-11-16 12:30                   ` Christoph Lameter
2021-11-16 15:41                     ` Michal Hocko
2021-12-14 10:11                   ` Michal Hocko
2021-12-14 20:57                     ` Andrew Morton
2021-12-15 10:05                       ` Michal Hocko
2021-12-15 12:20                         ` Michal Hocko
