All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, shakeelb@google.com,
	ktkhai@virtuozzo.com, shy828301@gmail.com, guro@fb.com,
	vbabka@suse.cz, vdavydov.dev@gmail.com, raquini@redhat.com
Subject: Re: [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes
Date: Mon, 6 Dec 2021 13:43:27 +0100	[thread overview]
Message-ID: <51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com> (raw)
In-Reply-To: <Ya3yZWkj2wGRWDMz@dhcp22.suse.cz>

>> I'm not sure if it's rally whack-a-mole really applies. It's just the
>> for_each_node_* calls that need love. In other cases, we shouldn't
>> really stumble over an offline node.
>>
>> Someone deliberately decided to use "for_each_node()" instead of
>> for_each_online_node() without taking care of online vs. offline
>> semantics. That's just a BUG and needs fixing IMHO.
> 
> Well, I would argue this is too much to ask for the API users. It is
> also a trap that is just too easy to fall into. Turning for_each_node
> into for_each_online_node will not solve the problem just by itself.
> As you have pointed out an offline node can become online and without a
> hotplug notifier there is no way for_each_online_node would be a proper
> fit for anything that allocates memory only for online nodes. See the
> trap? People think they are doing the right thing but they are not in
> fact.

I agree that it requires care in the caller. And some callers don't seem
to know that the nid they hold in their hand is garbage and should be
used with care. And this is pretty much undocumented.

> 
> Now practically speaking !node_online should not apear node_online (note
> I am attentionally avoiding to say offline and online as that has a
> completely different semantic) shouldn't really happen for some
> architectures. x86 should allocate pgdat for each possible node. I do
> not know what was the architecture in this case but we already have
> another report for x86 that remains unexplained.

So we'd allocate the pgdat although all we want is just a zonelist. The
obvious alternative is to implement the fallback where reasonable -- for
example, in the page allocator. It knows the fallback order:
build_zonelists(). That's pretty much all we need the preferred_nid for.

So just making prepare_alloc_pages()/node_zonelist() deal with a missing
pgdat could make sense as well. Something like:


diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index b976c4177299..2d2649e78766 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -508,9 +508,14 @@ static inline int gfp_zonelist(gfp_t flags)
  *
  * For the case of non-NUMA systems the NODE_DATA() gets optimized to
  * &contig_page_data at compile-time.
+ *
+ * If the node does not have a pgdat yet, returns the zonelist of the
+ * first online node.
  */
 static inline struct zonelist *node_zonelist(int nid, gfp_t flags)
 {
+       if (unlikely(!NODE_DATA(nid)))
+               nid = first_online_node;
        return NODE_DATA(nid)->node_zonelists + gfp_zonelist(flags);
 }


But of course, there might be value in a proper node-aware fallback list
as we have in build_zonelists() -- but it remains questionable if the
difference for these corner cases would be relevant in practice.

Further, if we could have thousands of nodes, we'd have to update each
and every one when building zone lists ...

Removing the hotadd_new_pgdat() stuff does sound appealing, though.

-- 
Thanks,

David / dhildenb


  reply	other threads:[~2021-12-06 12:43 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-06  3:33 [RFC PATCH 0/2] mm: Dont allocate pages on a offline node Nico Pache
2021-12-06  3:33 ` [RFC PATCH 1/2] include/linux/gfp.h: Do not allocate pages on a offlined node Nico Pache
2021-12-06  3:37   ` Matthew Wilcox
2021-12-06  8:29   ` David Hildenbrand
2021-12-06  9:22   ` Michal Hocko
2021-12-07 21:24     ` Nico Pache
2021-12-06  3:33 ` [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes Nico Pache
2021-12-06  8:32   ` David Hildenbrand
2021-12-06  9:22   ` Michal Hocko
2021-12-06  9:24     ` Michal Hocko
2021-12-06 10:45     ` David Hildenbrand
2021-12-06 10:54       ` Michal Hocko
2021-12-06 11:00         ` David Hildenbrand
2021-12-06 11:22           ` Michal Hocko
2021-12-06 12:43             ` David Hildenbrand [this message]
2021-12-06 13:06               ` Michal Hocko
2021-12-06 13:47                 ` David Hildenbrand
2021-12-06 14:06                   ` Michal Hocko
2021-12-06 14:08                     ` David Hildenbrand
2021-12-06 14:21                       ` Michal Hocko
2021-12-06 14:30                         ` Vlastimil Babka
2021-12-06 14:53                           ` Michal Hocko
2021-12-06 18:26                             ` Yang Shi
2021-12-07 10:15                               ` Michal Hocko
2021-12-06 14:15                   ` Michal Hocko
2021-12-06 13:19       ` Kirill Tkhai
2021-12-06 13:24         ` Michal Hocko
2021-12-08 19:00           ` Nico Pache
2021-12-06 18:42         ` Yang Shi
2021-12-06 19:01           ` David Hildenbrand
2021-12-06 21:28             ` Yang Shi
2021-12-07 10:15               ` David Hildenbrand
2021-12-07 10:55             ` Michal Hocko
2021-12-07 21:45         ` Nico Pache
2021-12-07 21:40       ` Nico Pache
2021-12-07 21:34     ` Nico Pache
2021-12-06 18:45   ` Yang Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=guro@fb.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=npache@redhat.com \
    --cc=raquini@redhat.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.