All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: David Hildenbrand <david@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Alexey Makhalov <amakhalov@vmware.com>,
	Dennis Zhou <dennis@kernel.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Oscar Salvador <osalvador@suse.de>, Tejun Heo <tj@kernel.org>,
	Christoph Lameter <cl@linux.com>, Nico Pache <npache@redhat.com>,
	Wei Yang <richard.weiyang@gmail.com>,
	Rafael Aquini <raquini@redhat.com>
Subject: Re: [PATCH 4/6] mm, memory_hotplug: reorganize new pgdat initialization
Date: Thu, 27 Jan 2022 15:44:21 +0100	[thread overview]
Message-ID: <YfKvxVkkJ415DH4t@dhcp22.suse.cz> (raw)
In-Reply-To: <6db33bb0-c72a-5539-5873-14039702e2a3@redhat.com>

On Thu 27-01-22 13:46:53, David Hildenbrand wrote:
> On 27.01.22 09:53, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > When a !node_online node is brought up it needs a hotplug specific
> > initialization because the node could be either uninitialized yet or it
> > could have been recycled after previous hotremove. hotadd_init_pgdat is
> > responsible for that.
> > 
> > Internal pgdat state is initialized at two places currently
> > 	- hotadd_init_pgdat
> > 	- free_area_init_core_hotplug
> > There is no real clear cut what should go where but this patch's chosen to
> > move the whole internal state initialization into free_area_init_core_hotplug.
> > hotadd_init_pgdat is still responsible to pull all the parts together -
> > most notably to initialize zonelists because those depend on the overall topology.
> > 
> > This patch doesn't introduce any functional change.
> > 
> > Acked-by: Rafael Aquini <raquini@redhat.com>
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  include/linux/memory_hotplug.h |  2 +-
> >  mm/memory_hotplug.c            | 28 +++-------------------------
> >  mm/page_alloc.c                | 25 +++++++++++++++++++++++--
> >  3 files changed, 27 insertions(+), 28 deletions(-)
> > 
> > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > index 60f09d3ebb3d..76bf2de86def 100644
> > --- a/include/linux/memory_hotplug.h
> > +++ b/include/linux/memory_hotplug.h
> > @@ -319,7 +319,7 @@ extern void set_zone_contiguous(struct zone *zone);
> >  extern void clear_zone_contiguous(struct zone *zone);
> >  
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > -extern void __ref free_area_init_core_hotplug(int nid);
> > +extern void __ref free_area_init_core_hotplug(struct pglist_data *pgdat);
> >  extern int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> >  extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags);
> >  extern int add_memory_resource(int nid, struct resource *resource,
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 875cdc7ffa58..ddc62f8b591f 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -1166,39 +1166,16 @@ static pg_data_t __ref *hotadd_init_pgdat(int nid)
> >  {
> >  	struct pglist_data *pgdat;
> >  
> > -	pgdat = NODE_DATA(nid);
> > -
> >  	/*
> >  	 * NODE_DATA is preallocated (free_area_init) but its internal
> >  	 * state is not allocated completely. Add missing pieces.
> >  	 * Completely offline nodes stay around and they just need
> >  	 * reintialization.
> >  	 */
> > -	if (pgdat->per_cpu_nodestats == &boot_nodestats) {
> > -		pgdat->per_cpu_nodestats =
> > -			alloc_percpu(struct per_cpu_nodestat);
> > -	} else {
> > -		int cpu;
> > -		/*
> > -		 * Reset the nr_zones, order and highest_zoneidx before reuse.
> > -		 * Note that kswapd will init kswapd_highest_zoneidx properly
> > -		 * when it starts in the near future.
> > -		 */
> > -		pgdat->nr_zones = 0;
> > -		pgdat->kswapd_order = 0;
> > -		pgdat->kswapd_highest_zoneidx = 0;
> > -		for_each_online_cpu(cpu) {
> > -			struct per_cpu_nodestat *p;
> > -
> > -			p = per_cpu_ptr(pgdat->per_cpu_nodestats, cpu);
> > -			memset(p, 0, sizeof(*p));
> > -		}
> > -	}
> > -
> > -	pgdat->node_start_pfn = 0;
> > +	pgdat = NODE_DATA(nid);
> >  
> >  	/* init node's zones as empty zones, we don't have any present pages.*/
> > -	free_area_init_core_hotplug(nid);
> > +	free_area_init_core_hotplug(pgdat);
> >  
> >  	/*
> >  	 * The node we allocated has no zone fallback lists. For avoiding
> > @@ -1210,6 +1187,7 @@ static pg_data_t __ref *hotadd_init_pgdat(int nid)
> >  	 * When memory is hot-added, all the memory is in offline state. So
> >  	 * clear all zones' present_pages because they will be updated in
> >  	 * online_pages() and offline_pages().
> > +	 * TODO: should be in free_area_init_core_hotplug?
> >  	 */
> >  	reset_node_managed_pages(pgdat);
> >  	reset_node_present_pages(pgdat);
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 1a05669044d3..32d0189de4c5 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7506,12 +7506,33 @@ static void __meminit zone_init_internals(struct zone *zone, enum zone_type idx,
> >   * NOTE: this function is only called during memory hotplug
> >   */
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > -void __ref free_area_init_core_hotplug(int nid)
> > +void __ref free_area_init_core_hotplug(struct pglist_data *pgdat)
> >  {
> > +	int nid = pgdat->node_id;
> >  	enum zone_type z;
> > -	pg_data_t *pgdat = NODE_DATA(nid);
> > +	int cpu;
> >  
> >  	pgdat_init_internals(pgdat);
> > +
> > +	if (pgdat->per_cpu_nodestats == &boot_nodestats)
> > +		pgdat->per_cpu_nodestats = alloc_percpu(struct per_cpu_nodestat);
> > +
> > +	/*
> > +	 * Reset the nr_zones, order and highest_zoneidx before reuse.
> > +	 * Note that kswapd will init kswapd_highest_zoneidx properly
> > +	 * when it starts in the near future.
> > +	 */
> > +	pgdat->nr_zones = 0;
> > +	pgdat->kswapd_order = 0;
> > +	pgdat->kswapd_highest_zoneidx = 0;
> > +	pgdat->node_start_pfn = 0;
> > +	for_each_online_cpu(cpu) {
> > +		struct per_cpu_nodestat *p;
> > +
> > +		p = per_cpu_ptr(pgdat->per_cpu_nodestats, cpu);
> > +		memset(p, 0, sizeof(*p));
> > +	}
> > +
> >  	for (z = 0; z < MAX_NR_ZONES; z++)
> >  		zone_init_internals(&pgdat->node_zones[z], z, nid, 0);
> >  }
> 
> I feel like we should be initializing all of this only once, just after
> allocating the node. There should be no difference between a node we're
> reusing and a "fresh" node. IOW, memory offlining should be resetting
> all state accordingly when the last memory goes away.
> 
> But I might be wrong and this patch looks like an improvement, as you
> say,  without functional change

Yeah, I really wanted to have this simple and straightforward. To be
completely honest I am not even sure this is necessary. Something really
woth looking at.
 
> Acked-by: David Hildenbrand <david@redhat.com>

Thanks!

-- 
Michal Hocko
SUSE Labs

  reply	other threads:[~2022-01-27 14:44 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-27  8:52 [PATCH 0/6] mm, memory_hotplug: handle unitialized numa node gracefully Michal Hocko
2022-01-27  8:53 ` [PATCH 1/6] mm, memory_hotplug: make arch_alloc_nodedata independent on CONFIG_MEMORY_HOTPLUG Michal Hocko
2022-01-27 12:27   ` David Hildenbrand
2022-01-27 13:36   ` Mike Rapoport
2022-01-28  6:18   ` Oscar Salvador
2022-02-01  2:13   ` Wei Yang
2022-01-27  8:53 ` [PATCH 2/6] mm: handle uninitialized numa nodes gracefully Michal Hocko
2022-01-27 12:41   ` David Hildenbrand
2022-01-27 14:50     ` Michal Hocko
2022-01-27 13:37   ` Mike Rapoport
2022-01-27 14:47     ` Michal Hocko
2022-01-27 15:04       ` Mike Rapoport
2022-02-01  2:41       ` Wei Yang
2022-02-01  9:54         ` Michal Hocko
2022-02-03  0:21           ` Wei Yang
2022-02-03  7:23             ` Mike Rapoport
2022-02-03  8:27               ` David Hildenbrand
2022-02-03  9:08                 ` Mike Rapoport
2022-02-03  9:11                   ` David Hildenbrand
2022-02-03  9:19                     ` Michal Hocko
2022-02-03  9:21                       ` David Hildenbrand
2022-02-03  8:39               ` Michal Hocko
2022-02-04 22:54                 ` Andrew Morton
2022-01-28  6:27   ` Oscar Salvador
2022-01-31 10:34   ` Michal Hocko
2022-01-27  8:53 ` [PATCH 3/6] mm, memory_hotplug: drop arch_free_nodedata Michal Hocko
2022-01-27 12:42   ` David Hildenbrand
2022-01-27 13:37   ` Mike Rapoport
2022-01-28  6:29   ` Oscar Salvador
2022-02-01  2:41   ` Wei Yang
2022-01-27  8:53 ` [PATCH 4/6] mm, memory_hotplug: reorganize new pgdat initialization Michal Hocko
2022-01-27 12:46   ` David Hildenbrand
2022-01-27 14:44     ` Michal Hocko [this message]
2022-02-17 10:40       ` Oscar Salvador
2022-01-27 13:39   ` Mike Rapoport
2022-01-27 14:45     ` Michal Hocko
2022-01-28 10:51   ` Oscar Salvador
2022-02-01  2:42   ` Wei Yang
2022-01-27  8:53 ` [PATCH 5/6] mm: make free_area_init_node aware of memory less nodes Michal Hocko
2022-01-27 12:47   ` David Hildenbrand
2022-01-27 13:34   ` Mike Rapoport
2022-01-28 10:59   ` Oscar Salvador
2022-02-01  2:43   ` Wei Yang
2022-01-27  8:53 ` [PATCH 6/6] memcg: do not tweak node in alloc_mem_cgroup_per_node_info Michal Hocko
2022-01-27 12:50   ` David Hildenbrand
2022-01-28 11:01   ` Oscar Salvador
2022-02-01  2:45   ` Wei Yang
2022-02-01  9:51     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YfKvxVkkJ415DH4t@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=amakhalov@vmware.com \
    --cc=cl@linux.com \
    --cc=david@redhat.com \
    --cc=dennis@kernel.org \
    --cc=eric.dumazet@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npache@redhat.com \
    --cc=osalvador@suse.de \
    --cc=raquini@redhat.com \
    --cc=richard.weiyang@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.