From: Eric Dumazet <eric.dumazet@gmail.com>
To: Alexey Makhalov <amakhalov@vmware.com>, linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Michal Hocko <mhocko@suse.com>,
	Oscar Salvador <osalvador@suse.de>,
	Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH v3] mm: fix panic in __alloc_pages
Date: Mon, 8 Nov 2021 18:08:52 -0800
Message-ID: <2e191db3-286f-90c6-bf96-3f89891e9926@gmail.com>
In-Reply-To: <20211108202325.20304-1-amakhalov@vmware.com>



On 11/8/21 12:23 PM, Alexey Makhalov wrote:
> A kernel panic is caused by pcpu_alloc_pages() passing an
> offline, uninitialized node to alloc_pages_node(), which then
> dereferences the NULL NODE_DATA(nid) of that node.
> 
>  CPU2 has been hot-added
>  BUG: unable to handle page fault for address: 0000000000001608
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 0 P4D 0
>  Oops: 0000 [#1] SMP PTI
>  CPU: 0 PID: 1 Comm: systemd Tainted: G            E     5.15.0-rc7+ #11
>  Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW
> 
>  RIP: 0010:__alloc_pages+0x127/0x290
>  Code: 4c 89 f0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 44 89 e0 48 8b 55 b8 c1 e8 0c 83 e0 01 88 45 d0 4c 89 c8 48 85 d2 0f 85 1a 01 00 00 <45> 3b 41 08 0f 82 10 01 00 00 48 89 45 c0 48 8b 00 44 89 e2 81 e2
>  RSP: 0018:ffffc900006f3bc8 EFLAGS: 00010246
>  RAX: 0000000000001600 RBX: 0000000000000000 RCX: 0000000000000000
>  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000cc2
>  RBP: ffffc900006f3c18 R08: 0000000000000001 R09: 0000000000001600
>  R10: ffffc900006f3a40 R11: ffff88813c9fffe8 R12: 0000000000000cc2
>  R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000cc2
>  FS:  00007f27ead70500(0000) GS:ffff88807ce00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 0000000000001608 CR3: 000000000582c003 CR4: 00000000001706b0
>  Call Trace:
>   pcpu_alloc_pages.constprop.0+0xe4/0x1c0
>   pcpu_populate_chunk+0x33/0xb0
>   pcpu_alloc+0x4d3/0x6f0
>   __alloc_percpu_gfp+0xd/0x10
>   alloc_mem_cgroup_per_node_info+0x54/0xb0
>   mem_cgroup_alloc+0xed/0x2f0
>   mem_cgroup_css_alloc+0x33/0x2f0
>   css_create+0x3a/0x1f0
>   cgroup_apply_control_enable+0x12b/0x150
>   cgroup_mkdir+0xdd/0x110
>   kernfs_iop_mkdir+0x4f/0x80
>   vfs_mkdir+0x178/0x230
>   do_mkdirat+0xfd/0x120
>   __x64_sys_mkdir+0x47/0x70
>   ? syscall_exit_to_user_mode+0x21/0x50
>   do_syscall_64+0x43/0x90
>   entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> The panic can be easily reproduced by disabling the udev rule for
> automatic onlining of hot-added CPUs and then hot-adding a CPU
> with a memoryless node (a NUMA node with CPUs only).
> 
> Hot-adding a CPU with a memoryless node does not bring the node
> online. A memoryless node is onlined only when its CPU is
> onlined.
> 
> Node can be in one of the following states:
> 1. not present (nid == NUMA_NO_NODE)
> 2. present, but offline (nid > NUMA_NO_NODE, node_online(nid) == 0,
> 				NODE_DATA(nid) == NULL)
> 3. present and online (nid > NUMA_NO_NODE, node_online(nid) > 0,
> 				NODE_DATA(nid) != NULL)
> 
> The percpu code allocates for all possible CPUs. The issue
> occurs when it serves a hot-added but not yet onlined CPU
> whose node is in the 2nd state. Such a node is not ready
> for use, so fall back to numa_mem_id().
> 
> Signed-off-by: Alexey Makhalov <amakhalov@vmware.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Dennis Zhou <dennis@kernel.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Cc: stable@vger.kernel.org
> ---
>  mm/percpu-vm.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
> index 2054c9213..f58d73c92 100644
> --- a/mm/percpu-vm.c
> +++ b/mm/percpu-vm.c
> @@ -84,15 +84,19 @@ static int pcpu_alloc_pages(struct pcpu_chunk *chunk,
>  			    gfp_t gfp)
>  {
>  	unsigned int cpu, tcpu;
> -	int i;
> +	int i, nid;
>  
>  	gfp |= __GFP_HIGHMEM;
>  
>  	for_each_possible_cpu(cpu) {
> +		nid = cpu_to_node(cpu);
> +		if (nid == NUMA_NO_NODE || !node_online(nid))
> +			nid = numa_mem_id();

Maybe this fallback should fail if (gfp & __GFP_THISNODE) is set?

Or maybe the per-cpu allocator does not support this constraint anyway.

I am a bit worried that we do not really know if pages are
allocated on the right node or not.

Some workloads could really be hurt if all per-cpu pages were
put on a single NUMA node.
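
To make the concern concrete, here is a minimal userspace sketch of the
node selection being discussed. It is not kernel code and not the patch
itself: every model_* name and MODEL_* constant is hypothetical and only
stands in for cpu_to_node(), node_online(), numa_mem_id(), NUMA_NO_NODE
and __GFP_THISNODE. It assumes the fallback would be refused when the
caller asks for the exact node, which is only the suggestion above, not
confirmed behavior of the per-cpu allocator.

```c
/* Userspace model only; all names here are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NO_NODE   (-1)   /* stands in for NUMA_NO_NODE */
#define MODEL_THISNODE  0x1u   /* stands in for __GFP_THISNODE */
#define MODEL_NR_CPUS   4
#define MODEL_NR_NODES  2

struct model_topology {
	int  cpu_node[MODEL_NR_CPUS];      /* models cpu_to_node(cpu) */
	bool node_online[MODEL_NR_NODES];  /* models node_online(nid) */
	int  local_node;                   /* models numa_mem_id() */
};

/* True when the CPU's own node is absent or present but offline. */
bool model_needs_fallback(const struct model_topology *t, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	int nid = t->cpu_node[cpu];

	if (nid == MODEL_NO_NODE)
		return true;
	assert(nid >= 0 && nid < MODEL_NR_NODES);
	return !t->node_online[nid];
}

/*
 * Pick the node to allocate from for one CPU. Returns MODEL_NO_NODE
 * when the CPU's node is unusable and the caller insisted on that
 * exact node (assumed policy, see the lead-in).
 */
int model_pick_node(const struct model_topology *t, int cpu,
		    unsigned int gfp)
{
	if (!model_needs_fallback(t, cpu))
		return t->cpu_node[cpu];
	if (gfp & MODEL_THISNODE)
		return MODEL_NO_NODE;
	return t->local_node;
}

/* Count CPUs whose pages would silently land on the fallback node. */
int model_count_fallbacks(const struct model_topology *t)
{
	int n = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_needs_fallback(t, cpu))
			n++;
	return n;
}
```

With a topology where CPU 2 sits on an offline node and CPU 3 has no
node at all, model_count_fallbacks() reports two CPUs whose per-cpu
pages would all come from the local node, which is the placement
skew in question.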

> +
>  		for (i = page_start; i < page_end; i++) {
>  			struct page **pagep = &pages[pcpu_page_idx(cpu, i)];
>  
> -			*pagep = alloc_pages_node(cpu_to_node(cpu), gfp, 0);
> +			*pagep = alloc_pages_node(nid, gfp, 0);
>  			if (!*pagep)
>  				goto err;
>  		}
> 
