All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Alexey Makhalov <amakhalov@vmware.com>
Cc: David Hildenbrand <david@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	Oscar Salvador <OSalvador@suse.com>
Subject: Re: [PATCH] mm: fix panic in __alloc_pages
Date: Tue, 2 Nov 2021 10:04:22 +0100	[thread overview]
Message-ID: <YYD/FkpAk5IvmOux@dhcp22.suse.cz> (raw)
In-Reply-To: <C69EF2FE-DFF6-492E-AD40-97A53739C3EC@vmware.com>

It is hard to follow your reply as your email client is not quoting
properly. Let me try to reconstruct

On Tue 02-11-21 08:48:27, Alexey Makhalov wrote:
> On 02.11.21 08:47, Michal Hocko wrote:
[...]
>>>>  CPU2 has been hot-added
>>>>  BUG: unable to handle page fault for address: 0000000000001608
>>>>  #PF: supervisor read access in kernel mode
>>>>  #PF: error_code(0x0000) - not-present page
>>>>  PGD 0 P4D 0
>>>>  Oops: 0000 [#1] SMP PTI
>>>>  CPU: 0 PID: 1 Comm: systemd Tainted: G            E     5.15.0-rc7+ #11
>>>>  Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW
>>>>
>>>>  RIP: 0010:__alloc_pages+0x127/0x290
>>> 
>>> Could you resolve this into a specific line of the source code please?

This got probably unnoticed. I would be really curious whether this is
a broken zonelist or something else.
 
>>>> Node can be in one of the following states:
>>>> 1. not present (nid == NUMA_NO_NODE)
>>>> 2. present, but offline (nid > NUMA_NO_NODE, node_online(nid) == 0,
>>>> 				NODE_DATA(nid) == NULL)
>>>> 3. present and online (nid > NUMA_NO_NODE, node_online(nid) > 0,
>>>> 				NODE_DATA(nid) != NULL)
>>>>
>>>> alloc_page_{bulk_array}node() functions verify for nid validity only
>>>> and do not check if nid is online. Enhanced verification check allows
>>>> to handle page allocation when node is in 2nd state.
>>> 
>>> I do not think this is a correct approach. We should make sure that the
>>> proper fallback node is used instead. This means that the zone list is
>>> initialized properly. IIRC this has been a problem in the past and it
>>> has been fixed. The initialization code is quite subtle though so it is
>>> possible that this got broken again.

> This approach behaves in the same way as CPU was not yet added. (state #1).
> So, we can think of state #2 as state #1 when CPU is not present.

>> I'm a little confused:
>> 
>> In add_memory_resource() we hotplug the new node if required and set it
>> online. Memory might get onlined later, via online_pages().
>
> You are correct. In case of memory hot add, it is true. But in case of adding
> CPU with memoryless node, try_node_online() will be called only during CPU
> onlining, see cpu_up().
> 
> Is there any reason why try_online_node() resides in cpu_up() and not in add_cpu()?
> I think it would be correct to online node during the CPU hot add to align with
> memory hot add.

I am not familiar with cpu hotplug, but this doesn't seem to be anything
new so how come this became problem only now?

>> So after add_memory_resource()->__try_online_node() succeeded, we have
>> an online pgdat -- essentially 3.
>> 
> This patch detects if we're past 3. but says that it reproduced by
> disabling *memory* onlining.
> This is the hot adding of both new CPU and new _memoryless_ node (with CPU only)
> And onlining CPU makes its node online. Disabling CPU onlining puts new node
> into state #2, which leads to repro.    
> 
>> Before we online memory for a hotplugged node, all zones are !populated.
>> So once we online memory for a !populated zone in online_pages(), we
>> trigger setup_zone_pageset().
>> 
>> 
>> The confusing part is that this patch checks for 3. but says it can be
>> reproduced by not onlining *memory*. There seems to be something missing.
> 
> Do we maybe need a proper populated_zone() check before accessing zone data?

No, we need them initialize properly.
-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2021-11-02  9:04 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-01 20:13 [PATCH] mm: fix panic in __alloc_pages Alexey Makhalov
2021-11-01 20:38 ` Matthew Wilcox
2021-11-02  7:47 ` Michal Hocko
2021-11-02  8:12   ` David Hildenbrand
2021-11-02  8:48     ` Alexey Makhalov
2021-11-02  9:04       ` Michal Hocko [this message]
2021-11-02  9:24         ` David Hildenbrand
2021-11-02 10:34           ` Alexey Makhalov
2021-11-02 11:00             ` David Hildenbrand
2021-11-02 11:44               ` Michal Hocko
2021-11-02 12:06                 ` David Hildenbrand
2021-11-02 12:27                   ` Michal Hocko
2021-11-02 12:39                     ` David Hildenbrand
2021-11-02 13:25                       ` Michal Hocko
2021-11-02 13:41                         ` David Hildenbrand
2021-11-02 14:12                           ` Michal Hocko
2021-11-02 14:44                             ` David Hildenbrand
2021-11-02 13:52                         ` Oscar Salvador
2021-11-02 14:35                           ` Michal Hocko
2021-11-08  6:12                   ` Alexey Makhalov
2021-11-08  6:36                     ` [PATCH v2] " Alexey Makhalov
2021-11-08  8:32                       ` David Hildenbrand
2021-11-08 20:23                         ` [PATCH v3] " Alexey Makhalov
2021-11-09  2:08                           ` Eric Dumazet
2021-11-09  7:03                             ` David Hildenbrand
2021-11-09 16:55                               ` Eric Dumazet
2021-11-09 17:15                             ` Michal Hocko
2021-11-09 19:06                               ` Dennis Zhou
2021-11-09 19:54                                 ` Michal Hocko
2021-11-16  1:31                                   ` Alexey Makhalov
2021-11-16  9:17                                     ` Michal Hocko
2021-11-16 20:22                                       ` Alexey Makhalov
2021-11-18  8:35                                         ` Michal Hocko
2021-12-07 10:54                                           ` Michal Hocko
2021-12-07 11:08                                             ` David Hildenbrand
2021-12-07 12:13                                               ` Michal Hocko
2021-12-07 12:28                                                 ` David Hildenbrand
2021-12-07 13:23                                                   ` Michal Hocko
2021-12-07 15:09                                                     ` David Hildenbrand
2021-12-07 15:29                                                       ` Michal Hocko
2021-12-07 15:34                                                         ` David Hildenbrand
2021-12-07 15:56                                                           ` Michal Hocko
2021-12-07 16:09                                                             ` David Hildenbrand
2021-12-07 16:27                                                               ` Michal Hocko
2021-12-07 16:36                                                                 ` Michal Hocko
2021-12-07 16:40                                                                   ` David Hildenbrand
2021-12-08  8:28                                                                     ` Michal Hocko
2021-12-07 17:02                                                                   ` Alexey Makhalov
2021-12-07 17:13                                                                     ` David Hildenbrand
2021-12-07 17:17                                                                       ` Alexey Makhalov
2021-12-07 18:03                                                                         ` David Hildenbrand
2021-12-08  8:12                                                                           ` Michal Hocko
2021-12-08  8:24                                                                             ` David Hildenbrand
2021-12-08  8:34                                                                               ` Michal Hocko
2021-12-08  8:38                                                                                 ` David Hildenbrand
2021-12-08  8:04                                                                         ` Michal Hocko
2021-12-08  8:19                                                                           ` Alexey Makhalov
2021-12-08  8:30                                                                             ` Michal Hocko
2021-12-08  8:54                                             ` Michal Hocko
2021-12-08  8:57                                               ` Alexey Makhalov
2021-12-08  9:55                                                 ` Michal Hocko
2021-12-09  2:16                                               ` Alexey Makhalov
2021-12-09  8:46                                                 ` Michal Hocko
2021-12-09  9:28                                                   ` Alexey Makhalov
2021-12-09  9:56                                                     ` Michal Hocko
2021-12-09 10:23                                                       ` Alexey Makhalov
2021-12-09 13:29                                                         ` Michal Hocko
2021-12-09 19:01                                                           ` Alexey Makhalov
2021-12-10  9:11                                                             ` Michal Hocko
2021-12-17 12:53                                                               ` Michal Hocko
2021-12-21  5:46                                                                 ` Alexey Makhalov
2021-12-21  9:46                                                                   ` Michal Hocko
2021-12-21 20:23                                                                     ` Alexey Makhalov
2021-12-22 11:41                                                                       ` Michal Hocko
2021-12-09 10:48                                             ` Michal Hocko
2021-12-13 15:06                                               ` Michal Hocko
2021-12-13 15:07                                                 ` David Hildenbrand
2021-12-14  8:38                                                   ` Michal Hocko
2021-12-14 10:07                                               ` [PATCH v2 0/4] mm, memory_hotplug: handle unitialized numa node gracefully Michal Hocko
2021-12-14 10:07                                                 ` [PATCH v2 1/4] mm, memory_hotplug: make arch_alloc_nodedata independent on CONFIG_MEMORY_HOTPLUG Michal Hocko
2021-12-14 10:07                                                 ` [PATCH v2 2/4] mm: handle uninitialized numa nodes gracefully Michal Hocko
2021-12-14 10:33                                                   ` Christoph Lameter
2021-12-14 10:38                                                     ` Michal Hocko
2022-01-14  0:24                                                       ` Wei Yang
2022-01-14 10:01                                                         ` Michal Hocko
2021-12-15  4:47                                                   ` kernel test robot
2021-12-15  4:47                                                     ` kernel test robot
2021-12-15 10:12                                                     ` Michal Hocko
2021-12-15 10:12                                                       ` Michal Hocko
2021-12-14 10:07                                                 ` [PATCH v2 3/4] mm, memory_hotplug: drop arch_free_nodedata Michal Hocko
2021-12-14 10:07                                                 ` [PATCH v2 4/4] mm, memory_hotplug: reorganize new pgdat initialization Michal Hocko
2021-12-17 14:51                                                 ` [PATCH v2 0/4] mm, memory_hotplug: handle unitialized numa node gracefully David Hildenbrand
2021-12-21  9:51                                                   ` Michal Hocko
2022-01-02  7:14                                                     ` Mike Rapoport
2022-01-10 17:16                                                       ` Michal Hocko
2022-01-10 21:16                                                 ` Rafael Aquini
2022-01-11  8:34                                                   ` Michal Hocko
2021-11-08 10:37                       ` [PATCH v2] mm: fix panic in __alloc_pages Michal Hocko
2021-11-02  9:40         ` [PATCH] " Alexey Makhalov
2021-11-02  9:40         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYD/FkpAk5IvmOux@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=OSalvador@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=amakhalov@vmware.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.