From: Alexey Makhalov <amakhalov@vmware.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Dennis Zhou <dennis@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	Oscar Salvador <osalvador@suse.de>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: [PATCH v3] mm: fix panic in __alloc_pages
Date: Thu, 9 Dec 2021 02:16:17 +0000
Message-ID: <77BCF61E-224F-435D-8620-670C9E874A9A@vmware.com>
In-Reply-To: <YbBywDwc2bCxWGAQ@dhcp22.suse.cz>

Hi Michal,


> On Dec 8, 2021, at 12:54 AM, Michal Hocko <mhocko@suse.com> wrote:
> 
> Alexey,
> this is still not finalized but it would really help if you could give
> it a spin on your setup. I still have to think about how to transition
> from a memoryless node to standard node (in hotplug code). Also there
> might be other surprises on the way.
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c5952749ad40..8ed8db2ccb13 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6382,7 +6382,11 @@ static void __build_all_zonelists(void *data)
> 	if (self && !node_online(self->node_id)) {
> 		build_zonelists(self);
> 	} else {
> -		for_each_online_node(nid) {
> +		/*
> +		 * All possible nodes have pgdat preallocated
> +		 * free_area_init
> +		 */
> +		for_each_node(nid) {
> 			pg_data_t *pgdat = NODE_DATA(nid);
> 
> 			build_zonelists(pgdat);
> @@ -8032,8 +8036,32 @@ void __init free_area_init(unsigned long *max_zone_pfn)
> 	/* Initialise every node */
> 	mminit_verify_pageflags_layout();
> 	setup_nr_node_ids();
> -	for_each_online_node(nid) {
> -		pg_data_t *pgdat = NODE_DATA(nid);
> +	for_each_node(nid) {
> +		pg_data_t *pgdat;
> +
> +		if (!node_online(nid)) {
> +			pr_warn("Node %d uninitialized by the platform. Please report with boot dmesg.\n", nid);
> +
> +			/* Allocator not initialized yet */
> +			pgdat = memblock_alloc(sizeof(*pgdat), SMP_CACHE_BYTES);
> +			if (!pgdat) {
> +				pr_err("Cannot allocate %zuB for node %d.\n",
> +						sizeof(*pgdat), nid);
> +				continue;
> +			}
> +			/* TODO do we need this for memoryless nodes */
> +			pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
> +			arch_refresh_nodedata(nid, pgdat);
> +			free_area_init_memoryless_node(nid);
> +			/*
> +			 * not marking this node online because we do not want to
> +			 * confuse userspace by sysfs files/directories for node
> +			 * without any memory attached to it (see topology_init)
> +			 */
> +			continue;
> +		}
> +
> +		pgdat = NODE_DATA(nid);
> 		free_area_init_node(nid);
> 
> 		/* Any memory on that node */


After applying this patch, the kernel panics in early boot with:
[    0.081838] Initmem setup node 0 [mem 0x0000000000001000-0x000000007fffffff]
[    0.081842] Initmem setup node 1 [mem 0x0000000080000000-0x000000013fffffff]
[    0.081844] Node 2 uninitialized by the platform. Please report with boot dmesg.
[    0.081877] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    0.081879] #PF: supervisor read access in kernel mode
[    0.081882] #PF: error_code(0x0000) - not-present page
[    0.081884] PGD 0 P4D 0
[    0.081887] Oops: 0000 [#1] SMP PTI
[    0.081890] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0+ #33
[    0.081893] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW
[    0.081896] RIP: 0010:pcpu_alloc+0x330/0x850
[    0.081903] Code: c7 c7 e4 38 5b 82 e8 5f b5 60 00 81 7d ac c0 0c 00 00 0f 85 f1 04 00 00 48
[    0.081906] RSP: 0000:ffffffff82003dc0 EFLAGS: 00010046
[    0.081909] RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000cc0
[    0.081911] RDX: 0000000000000003 RSI: 0000000000000006 RDI: ffffffff825b38e4
[    0.081913] RBP: ffffffff82003e40 R08: ffff88813ffb7480 R09: 0000000000001000
[    0.081915] R10: 0000000000001000 R11: 000000013ffff000 R12: 0000000000000001
[    0.081917] R13: 0000000001a2c000 R14: 0000000000000000 R15: 0000000000000003
[    0.081919] FS:  0000000000000000(0000) GS:ffffffff822ee000(0000) knlGS:0000000000000000
[    0.081921] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.081923] CR2: 0000000000000000 CR3: 000000000200a000 CR4: 00000000000606b0
[    0.081946] Call Trace:
[    0.081951]  __alloc_percpu+0x15/0x20
[    0.081954]  free_area_init+0x270/0x300
[    0.081960]  zone_sizes_init+0x44/0x46
[    0.081965]  paging_init+0x23/0x25
[    0.081969]  setup_arch+0x5aa/0x668
[    0.081973]  start_kernel+0x53/0x5b6
[    0.081978]  x86_64_start_reservations+0x24/0x26
[    0.081983]  x86_64_start_kernel+0x70/0x74
[    0.081986]  secondary_startup_64_no_verify+0xb0/0xbb
[    0.081991] Modules linked in:
[    0.081993] CR2: 0000000000000000
[    0.081996] random: get_random_bytes called from oops_exit+0x39/0x60 with crng_init=0


pcpu_alloc+0x330 corresponds to:
/root/linux-5.15.0/mm/percpu.c:1833
        if (list_empty(&pcpu_chunk_lists[pcpu_free_slot])) {
    359e:       48 63 05 00 00 00 00    movslq 0x0(%rip),%rax        # 35a5 <pcpu_alloc+0x325>
                        35a1: R_X86_64_PC32     .data..ro_after_init+0x5c
    35a5:       48 c1 e0 04             shl    $0x4,%rax
    35a9:       48 03 05 00 00 00 00    add    0x0(%rip),%rax        # 35b0 <pcpu_alloc+0x330>
                        35ac: R_X86_64_PC32     pcpu_chunk_lists-0x4
list_empty():
/root/linux-5.15.0/./include/linux/list.h:282
        return READ_ONCE(head->next) == head;
    35b0:       48 8b 10                mov    (%rax),%rdx                     <-- rax == 0
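
As a rough illustration of the address arithmetic in the disassembly above, here is a small user-space sketch (not kernel code). The function name demo_slot_address is hypothetical. It mirrors the three instructions before the faulting load: sign-extend the slot index, shift left by 4 (a struct list_head is two pointers, so 16 bytes on x86_64), and add the base pointer. If both the base pointer and the slot index are still zero, which is presumably the state this early in boot, the result is address 0, matching CR2 in the oops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the instruction sequence:
 *   movslq slot,%rax ; shl $0x4,%rax ; add base,%rax
 * It returns the address the kernel would pass to list_empty().
 */
uintptr_t demo_slot_address(uintptr_t base, int32_t slot)
{
	uintptr_t rax = (uintptr_t)(intptr_t)slot;	/* movslq: sign-extend */

	rax <<= 4;	/* shl $0x4: multiply by 16 bytes per list head */
	rax += base;	/* add: base of the list array */
	return rax;
}
```

For example, demo_slot_address(0, 0) yields 0, while a non-NULL base such as 0x1000 with slot 3 yields 0x1030.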



free_area_init() -> /* added by patch */ alloc_percpu() -> pcpu_alloc():
        /*
         * No space left.  Create a new chunk.  We don't want multiple
         * tasks to create chunks simultaneously.  Serialize and create iff
         * there's still no empty chunk after grabbing the mutex.
         */
        if (is_atomic) {
                err = "atomic alloc failed, no space left";
                goto fail;
        }

        if (list_empty(&pcpu_chunk_lists[pcpu_free_slot])) {                     <-- &pcpu_chunk_lists[pcpu_free_slot] == NULL
                chunk = pcpu_create_chunk(pcpu_gfp);
                if (!chunk) {
                        err = "failed to allocate new chunk";
                        goto fail;
                }

                spin_lock_irqsave(&pcpu_lock, flags);
                pcpu_chunk_relocate(chunk, -1);
        } else { 


This patch calls alloc_percpu() from setup_arch(), at a point where the percpu allocator is not yet initialized (that is, before setup_per_cpu_areas() has run).
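
To show the ordering problem in isolation, here is a minimal user-space sketch (not kernel code, and not a proposed fix). All demo_* names are hypothetical stand-ins: demo_chunk_lists models pcpu_chunk_lists, demo_setup_per_cpu_areas() models the initialization done by setup_per_cpu_areas(), and demo_free_slot_is_empty() models the list_empty() check in pcpu_alloc(). Unlike the real code, the sketch checks for the uninitialized state and returns an error instead of dereferencing a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Models pcpu_chunk_lists: stays NULL until the allocator is set up. */
static struct demo_list_head *demo_chunk_lists;
static int demo_free_slot;

/* Models setup_per_cpu_areas(): allocates and initializes the slot lists. */
int demo_setup_per_cpu_areas(int nr_slots)
{
	int i;

	if (nr_slots <= 0)
		return -1;
	demo_chunk_lists = calloc((size_t)nr_slots, sizeof(*demo_chunk_lists));
	if (!demo_chunk_lists)
		return -1;
	for (i = 0; i < nr_slots; i++) {
		/* An empty list head points at itself. */
		demo_chunk_lists[i].next = &demo_chunk_lists[i];
		demo_chunk_lists[i].prev = &demo_chunk_lists[i];
	}
	demo_free_slot = nr_slots - 1;
	return 0;
}

/*
 * Models the check in pcpu_alloc(). Returns 1 if the free-slot list is
 * empty, 0 if it is not, and -1 if called before initialization. The
 * real kernel path has no such guard and faults on the NULL pointer.
 */
int demo_free_slot_is_empty(void)
{
	struct demo_list_head *head;

	if (!demo_chunk_lists)
		return -1;	/* allocator not initialized yet */
	head = &demo_chunk_lists[demo_free_slot];
	return head->next == head;
}
```

Calling demo_free_slot_is_empty() before demo_setup_per_cpu_areas() returns -1, which corresponds to the situation in the trace above; calling it afterwards returns 1 because every list starts out empty.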

Thanks,
—Alexey


