From: Michal Hocko <mhocko@suse.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>,
	Nico Pache <npache@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	akpm@linux-foundation.org, shakeelb@google.com,
	ktkhai@virtuozzo.com, shy828301@gmail.com, guro@fb.com,
	vdavydov.dev@gmail.com, raquini@redhat.com
Subject: Re: [RFC PATCH 2/2] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes
Date: Mon, 6 Dec 2021 15:53:56 +0100
Message-ID: <Ya4kBASzAJ32UBfT@dhcp22.suse.cz>
In-Reply-To: <f2c695f0-9621-a7be-82c3-8850dc8ca3e3@suse.cz>

On Mon 06-12-21 15:30:37, Vlastimil Babka wrote:
> On 12/6/21 15:21, Michal Hocko wrote:
> > On Mon 06-12-21 15:08:10, David Hildenbrand wrote:
> >> 
> >> >> But there might be more missing. Onlining a new zone will get more
> >> >> expensive in setups with a lot of possible nodes (x86-64 shouldn't
> >> >> really be an issue in that regard).
> >> > 
> >> > Honestly, I am not really concerned by platforms with too many nodes
> >> > without any memory. If they want to shoot their feet then that's their
> >> > choice. We can optimize for those if they ever prove to be standard.
> >> >  
> >> >> If we want stable backports, we'll want something simple upfront.
> >> > 
> >> > For stable backports I would be fine by doing your NODE_DATA check in
> >> > the allocator. In upstream I think we should be aiming for a more robust
> >> > solution that is also easier to maintain further down the line. Even if
> >> > that is an investment at this moment because the initialization code is
> >> > a mess.
> >> > 
> >> 
> >> Agreed. I would be curious *why* we decided to dynamically allocate the
> >> pgdat. Is this just a historical coincidence or was there a real reason to
> >> not allocate it for all possible nodes during boot?
> > 
> > I don't know but if I was to guess the most likely explanation would be
> > that the numa init code was in a similar order as now and it was easier
> > to simply allocate a pgdat when a new one was onlined.
> > 9af3c2dea3a3 ("[PATCH] pgdat allocation for new node add (call pgdat allocation)")
> > doesn't really tell much.
> 
> I don't know if that's true for pgdat specifically, but generally IMHO the
> advantages of allocating during/after online instead of for each possible
> node are
> - memory savings when some possible node is actually never online
> - at least in some cases, the allocations can be local to the node in
> question, where the advantages are
>   - faster access
>   - less memory occupied on nodes that are earlier online, especially node 0
> 
> So while the approach of allocate on boot for all possible nodes instead of
> just online nodes has advantages of being generally safer and simpler (no
> memory hotplug callbacks etc), we should also be careful not to overdo this
> approach so we don't end up with Node 0 memory filled with structures used
> for nodes 1-X that are just onlined later. I imagine that could be a problem
> even for "sane" archs that don't have tons of possible, but offline nodes.

Yes, this can indeed turn out to be a problem, as the memory allocations
scale not only with NUMA nodes but with memcgs as well. The latter being
the more visible one.

> Concretely, pgdat should probably be fine, but things like all shrinkers?
> Maybe less so.

Yeah, right. But for that purpose the concept of online_node is just
misleading. You would need a check whether the node is populated with
memory and implement hotplug notifiers.
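
To illustrate the distinction, here is a minimal user-space sketch, not
kernel code. It assumes a toy model in which each node carries separate
"possible", "online" and "has memory" flags (roughly what the N_POSSIBLE,
N_ONLINE and N_MEMORY node states express), and it allocates the per-node
structure only once a node is populated. All names below (node_model,
node_info_alloc, node_memory_online, node_memory_offline, MAX_NODES) are
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_NODES 8 /* hypothetical stand-in for the number of possible nodes */

/* Toy per-node state; not a kernel structure. */
struct node_model {
	bool possible;   /* node may ever appear */
	bool online;     /* node is online (may still be memoryless) */
	bool has_memory; /* node is populated with memory */
	void *info;      /* per-node structure, e.g. shrinker info */
};

static struct node_model nodes[MAX_NODES];

/*
 * Allocate the per-node structure only for a populated node.
 * Returns 0 on success or when there is nothing to do, -1 on failure.
 */
static int node_info_alloc(int nid, size_t size)
{
	assert(nid >= 0 && nid < MAX_NODES);

	struct node_model *n = &nodes[nid];

	/* Being online is not enough: skip memoryless nodes. */
	if (!n->has_memory || n->info)
		return 0;

	n->info = calloc(1, size);
	return n->info ? 0 : -1;
}

/* Hypothetical hotplug callback: memory on this node is going online. */
static int node_memory_online(int nid, size_t size)
{
	assert(nid >= 0 && nid < MAX_NODES);

	struct node_model *n = &nodes[nid];

	if (!n->possible)
		return -1;

	n->online = true;
	n->has_memory = true;
	return node_info_alloc(nid, size);
}

/* Hypothetical hotplug callback: the last memory on this node went away. */
static void node_memory_offline(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);

	struct node_model *n = &nodes[nid];

	free(n->info);
	n->info = NULL;
	n->has_memory = false;
}
```

In this model a node that is online but memoryless never gets the
structure; it is allocated from the online callback and released from the
offline one, which is the bookkeeping that hotplug notifiers would have to
take over.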

-- 
Michal Hocko
SUSE Labs
