All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>,
	Rafael Aquini <raquini@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Wei Yang <richard.weiyang@gmail.com>,
	Dennis Zhou <dennis@kernel.org>,
	Alexey Makhalov <amakhalov@vmware.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Oscar Salvador <osalvador@suse.de>
Subject: [PATCH 0/1] Fix allocating nodes twice on x86
Date: Fri, 18 Feb 2022 23:43:01 +0100	[thread overview]
Message-ID: <20220218224302.5282-1-osalvador@suse.de> (raw)

Hi,

I was about to send this patch as a part of a cleanup patchset that goes on
top of last Michal's work about properly init memoryless nodes [1], but I
thought I may send it as a standalone one as it is the only real fix of
that series.

Discussing the fix with Michal, another concern came up, about whether
memoryless-nodes but with CPUs should be marked as online at all.
Here Michal and I have different opinions, while I think that
memoryless-nodes with its CPUs online should be marked online,
he thinks that the property 'online' applied to node has to do more
with memory, and I have to confess that after having a look at 
some of the usages of for_each_online_node, most of them are checked
to do something memory-realted afterwards, so I guess he has a point
there.

My main concern is that if memoryless-nodes do not get online, its
sysfs showing the cpu<->numa topology will not be created and I am
not sure about losing that information.
E.g: i guess 'numactl -H' would not be able to show the right
topolovy as it coul not walk sysfs to get it right (in case it does).

Anyway, as a quick test, I wanted to see what happens if init_node_to_cpu()
and init_gi_nodes(), do not online the nodes.
(those nodes get online because of the CPU and Generic Initiator
affinity).
The outcome is that we blow up badly down the chain, and we do because
of a nasty side-effect (more information can be found at the end of
patch#1)
Short-long story: by the time bringup_nonboot_cpus() gets called to
bring up the cpus and bring up their respective nodes, we have not even
registered the node_subsys bus yet, so
bringup_nonboot_cpus()->..->__try_online_node->()register_one_node()->bus_add_device()
will crash when dereferencing some of the bus' struct fields.

We happen not to crash now, because init_to_cpu_node() and
init_gi_nodes() mark the node online, and by doing that,
it has the effect of __try_online_node() backing off and not
calling register_one_node().

I think the whole thing is kinda broken, or at the very least subtle and
fragile.

I am willing to have a look at how we can make this optimal, but wanted
to share with you.

Anyway, It is getting late here, so I just wanted to 1) send this quick
fix, 2) expose some nasty behavior and 3) kick off a discussion about
how to improve this situation.

[1] https://lore.kernel.org/lkml/20220127085305.20890-1-mhocko@kernel.org/

Oscar Salvador (1):
  arch/x86/mm/numa: Do not initialize nodes twice

 arch/x86/mm/numa.c | 15 ++-------------
 include/linux/mm.h |  1 -
 mm/page_alloc.c    |  2 +-
 3 files changed, 3 insertions(+), 15 deletions(-)

-- 
2.34.1


             reply	other threads:[~2022-02-18 22:43 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-18 22:43 Oscar Salvador [this message]
2022-02-18 22:43 ` [PATCH 1/1] arch/x86/mm/numa: Do not initialize nodes twice Oscar Salvador
2022-02-21  9:20   ` Michal Hocko
2022-02-21  9:47     ` Oscar Salvador
2022-02-21 13:32       ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220218224302.5282-1-osalvador@suse.de \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=amakhalov@vmware.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=raquini@redhat.com \
    --cc=richard.weiyang@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.