All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	Zhang Zhen <zhenzhang.zhang@huawei.com>,
	David Rientjes <rientjes@google.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Dan Williams <dan.j.williams@gmail.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less
Date: Mon, 3 Apr 2017 22:23:38 +0200	[thread overview]
Message-ID: <20170403202337.GA12482@dhcp22.suse.cz> (raw)
In-Reply-To: <20170403195830.64libncet5l6vuvb@arbab-laptop>

On Mon 03-04-17 14:58:30, Reza Arbab wrote:
> On Mon, Apr 03, 2017 at 01:55:46PM +0200, Michal Hocko wrote:
> >Anyting? I would really appreciate a feedback from IBM and Futjitsu guys
> >who have shaped this code last few years. Also Igor and Vitaly seem to be
> >using memory hotplug in virtualized environments. I do not expect they
> >would see a huge advantage of the rework but I would appreciate to give it
> >some testing to catch any potential regressions.
> 
> Sorry for the delayed reply.
> 
> With this set, I'm able to "online_movable" blocks in ascending order.
> 
> However, I am seeing a regression. When adding memory to a memoryless node,
> it shows up in node 0 instead. I'm digging to see if I can help narrow down
> where things go wrong.

OK, I guess I know what is going on here. online_pages relies on
pfn_to_nid(pfn) to return a proper node. But we are doing
page_to_nid(pfn_to_page(__pfn_to_nid_pfn)) so we rely on the page being
properly initialized. Damn, I should have noticed that. There are two
ways around that. Either the __add_section stores the nid into the
struct page and make page_to_nid reliable or store it somewhere else
(ideally into the memblock). The first is easier (let's do it for now)
but longterm we do not want to rely on the struct page at all I believe.

Does the following help?
---
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b9dc1c4e26c3..0e21b9f67c9d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -309,14 +309,19 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn)
 
 	/*
 	 * Make all the pages reserved so that nobody will stumble over half
-	 * initialized state.
+	 * initialized state. 
+	 * FIXME: We also have to associate it with a node because pfn_to_node
+	 * relies on having page with the proper node.
 	 */
 	for (i = 0; i < PAGES_PER_SECTION; i++) {
 		unsigned long pfn = phys_start_pfn + i;
+		struct page *page;
 		if (!pfn_valid(pfn))
 			continue;
 
-		SetPageReserved(pfn_to_page(phys_start_pfn + i));
+		page = pfn_to_page(pfn);
+		set_page_node(page, nid);
+		SetPageReserved(page);
 	}
 
 	return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
-- 
Michal Hocko
SUSE Labs

WARNING: multiple messages have this Message-ID (diff)
From: Michal Hocko <mhocko@kernel.org>
To: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Vlastimil Babka <vbabka@suse.cz>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	Tang Chen <tangchen@cn.fujitsu.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	Zhang Zhen <zhenzhang.zhang@huawei.com>,
	David Rientjes <rientjes@google.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Dan Williams <dan.j.williams@gmail.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: [PATCH 0/6] mm: make movable onlining suck less
Date: Mon, 3 Apr 2017 22:23:38 +0200	[thread overview]
Message-ID: <20170403202337.GA12482@dhcp22.suse.cz> (raw)
In-Reply-To: <20170403195830.64libncet5l6vuvb@arbab-laptop>

On Mon 03-04-17 14:58:30, Reza Arbab wrote:
> On Mon, Apr 03, 2017 at 01:55:46PM +0200, Michal Hocko wrote:
> >Anyting? I would really appreciate a feedback from IBM and Futjitsu guys
> >who have shaped this code last few years. Also Igor and Vitaly seem to be
> >using memory hotplug in virtualized environments. I do not expect they
> >would see a huge advantage of the rework but I would appreciate to give it
> >some testing to catch any potential regressions.
> 
> Sorry for the delayed reply.
> 
> With this set, I'm able to "online_movable" blocks in ascending order.
> 
> However, I am seeing a regression. When adding memory to a memoryless node,
> it shows up in node 0 instead. I'm digging to see if I can help narrow down
> where things go wrong.

OK, I guess I know what is going on here. online_pages relies on
pfn_to_nid(pfn) to return a proper node. But we are doing
page_to_nid(pfn_to_page(__pfn_to_nid_pfn)) so we rely on the page being
properly initialized. Damn, I should have noticed that. There are two
ways around that. Either the __add_section stores the nid into the
struct page and make page_to_nid reliable or store it somewhere else
(ideally into the memblock). The first is easier (let's do it for now)
but longterm we do not want to rely on the struct page at all I believe.

Does the following help?
---
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b9dc1c4e26c3..0e21b9f67c9d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -309,14 +309,19 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn)
 
 	/*
 	 * Make all the pages reserved so that nobody will stumble over half
-	 * initialized state.
+	 * initialized state. 
+	 * FIXME: We also have to associate it with a node because pfn_to_node
+	 * relies on having page with the proper node.
 	 */
 	for (i = 0; i < PAGES_PER_SECTION; i++) {
 		unsigned long pfn = phys_start_pfn + i;
+		struct page *page;
 		if (!pfn_valid(pfn))
 			continue;
 
-		SetPageReserved(pfn_to_page(phys_start_pfn + i));
+		page = pfn_to_page(pfn);
+		set_page_node(page, nid);
+		SetPageReserved(page);
 	}
 
 	return register_new_memory(nid, __pfn_to_section(phys_start_pfn));
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-04-03 20:25 UTC|newest]

Thread overview: 140+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-30 11:54 [PATCH 0/6] mm: make movable onlining suck less Michal Hocko
2017-03-30 11:54 ` Michal Hocko
2017-03-30 11:54 ` [PATCH 1/6] mm: get rid of zone_is_initialized Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-31  3:39   ` Hillf Danton
2017-03-31  3:39     ` Hillf Danton
2017-03-31  6:43     ` Michal Hocko
2017-03-31  6:43       ` Michal Hocko
2017-03-31  6:48       ` Michal Hocko
2017-03-31  6:48         ` Michal Hocko
2017-03-31  7:39   ` [PATCH v1 " Michal Hocko
2017-03-31  7:39     ` Michal Hocko
2017-04-05  8:14     ` Michal Hocko
2017-04-05  8:14       ` Michal Hocko
2017-04-05  9:06       ` Igor Mammedov
2017-04-05  9:06         ` Igor Mammedov
2017-04-05  9:23         ` Michal Hocko
2017-04-05  9:23           ` Michal Hocko
2017-03-30 11:54 ` [PATCH 2/6] mm, tile: drop arch_{add,remove}_memory Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-30 15:41   ` Chris Metcalf
2017-03-30 15:41     ` Chris Metcalf
2017-03-30 11:54 ` [PATCH 3/6] mm: remove return value from init_currently_empty_zone Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-31  3:49   ` Hillf Danton
2017-03-31  3:49     ` Hillf Danton
2017-03-31  6:49     ` Michal Hocko
2017-03-31  6:49       ` Michal Hocko
2017-03-31  7:06       ` Hillf Danton
2017-03-31  7:06         ` Hillf Danton
2017-03-31  7:18         ` Michal Hocko
2017-03-31  7:18           ` Michal Hocko
2017-03-31  7:43   ` Michal Hocko
2017-03-31  7:43     ` Michal Hocko
2017-04-03 21:22   ` Reza Arbab
2017-04-03 21:22     ` Reza Arbab
2017-04-04  7:30     ` Michal Hocko
2017-04-04  7:30       ` Michal Hocko
2017-03-30 11:54 ` [PATCH 4/6] mm, memory_hotplug: use node instead of zone in can_online_high_movable Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-30 11:54 ` [PATCH 5/6] mm, memory_hotplug: do not associate hotadded memory to zones until online Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-31  6:18   ` Hillf Danton
2017-03-31  6:18     ` Hillf Danton
2017-03-31  6:50     ` Michal Hocko
2017-03-31  6:50       ` Michal Hocko
2017-04-04 12:21   ` Tobias Regnery
2017-04-04 12:21     ` Tobias Regnery
2017-04-04 12:45     ` Michal Hocko
2017-04-04 12:45       ` Michal Hocko
2017-04-06  8:14   ` Michal Hocko
2017-04-06  8:14     ` Michal Hocko
2017-04-06 12:46   ` Michal Hocko
2017-04-06 12:46     ` Michal Hocko
2017-03-30 11:54 ` [PATCH 6/6] mm, memory_hotplug: remove unused cruft after memory hotplug rework Michal Hocko
2017-03-30 11:54   ` Michal Hocko
2017-03-31  7:46   ` Michal Hocko
2017-03-31  7:46     ` Michal Hocko
2017-03-31 19:19 ` [PATCH 0/6] mm: make movable onlining suck less Heiko Carstens
2017-03-31 19:19   ` Heiko Carstens
2017-04-03  7:34   ` Michal Hocko
2017-04-03  7:34     ` Michal Hocko
2017-04-03 11:55 ` Michal Hocko
2017-04-03 11:55   ` Michal Hocko
2017-04-03 12:20   ` Igor Mammedov
2017-04-03 12:20     ` Igor Mammedov
2017-04-03 19:58   ` Reza Arbab
2017-04-03 19:58     ` Reza Arbab
2017-04-03 20:23     ` Michal Hocko [this message]
2017-04-03 20:23       ` Michal Hocko
2017-04-03 20:42       ` Reza Arbab
2017-04-03 20:42         ` Reza Arbab
2017-04-04  7:23         ` Michal Hocko
2017-04-04  7:23           ` Michal Hocko
2017-04-04  7:34           ` Michal Hocko
2017-04-04  7:34             ` Michal Hocko
2017-04-04  8:23             ` Michal Hocko
2017-04-04  8:23               ` Michal Hocko
2017-04-04 15:59               ` Reza Arbab
2017-04-04 15:59                 ` Reza Arbab
2017-04-04 16:02               ` Reza Arbab
2017-04-04 16:02                 ` Reza Arbab
2017-04-04 16:44                 ` Michal Hocko
2017-04-04 16:44                   ` Michal Hocko
2017-04-04 18:30                   ` Reza Arbab
2017-04-04 18:30                     ` Reza Arbab
2017-04-04 19:41                     ` Michal Hocko
2017-04-04 19:41                       ` Michal Hocko
2017-04-04 21:43                       ` Reza Arbab
2017-04-04 21:43                         ` Reza Arbab
2017-04-05  6:42                         ` Michal Hocko
2017-04-05  6:42                           ` Michal Hocko
2017-04-05  9:24                           ` Michal Hocko
2017-04-05  9:24                             ` Michal Hocko
2017-04-05 14:53                             ` Reza Arbab
2017-04-05 14:53                               ` Reza Arbab
2017-04-05 15:42                               ` Michal Hocko
2017-04-05 15:42                                 ` Michal Hocko
2017-04-05 17:32                                 ` Reza Arbab
2017-04-05 17:32                                   ` Reza Arbab
2017-04-05 18:15                                   ` Michal Hocko
2017-04-05 18:15                                     ` Michal Hocko
2017-04-05 19:39                                     ` Michal Hocko
2017-04-05 19:39                                       ` Michal Hocko
2017-04-05 21:02                                     ` Michal Hocko
2017-04-05 21:02                                       ` Michal Hocko
2017-04-06 11:07                                       ` Michal Hocko
2017-04-06 11:07                                         ` Michal Hocko
2017-04-05 15:48                           ` Reza Arbab
2017-04-05 15:48                             ` Reza Arbab
2017-04-05 16:34                             ` Michal Hocko
2017-04-05 16:34                               ` Michal Hocko
2017-04-05 20:55                               ` Reza Arbab
2017-04-05 20:55                                 ` Reza Arbab
2017-04-06  9:25                               ` Michal Hocko
2017-04-06  9:25                                 ` Michal Hocko
2017-04-05 13:52                         ` Michal Hocko
2017-04-05 13:52                           ` Michal Hocko
2017-04-05 15:23                           ` Reza Arbab
2017-04-05 15:23                             ` Reza Arbab
2017-04-05  6:36                       ` Michal Hocko
2017-04-05  6:36                         ` Michal Hocko
2017-04-06 13:08 ` Michal Hocko
2017-04-06 13:08   ` Michal Hocko
2017-04-06 15:24   ` Reza Arbab
2017-04-06 15:24     ` Reza Arbab
2017-04-06 15:41     ` Michal Hocko
2017-04-06 15:41       ` Michal Hocko
2017-04-06 15:46       ` Reza Arbab
2017-04-06 15:46         ` Reza Arbab
2017-04-06 16:21         ` Michal Hocko
2017-04-06 16:21           ` Michal Hocko
2017-04-06 16:24           ` Mel Gorman
2017-04-06 16:24             ` Mel Gorman
2017-04-06 16:55           ` Mel Gorman
2017-04-06 16:55             ` Mel Gorman
2017-04-06 17:12             ` Michal Hocko
2017-04-06 17:12               ` Michal Hocko
2017-04-06 17:46               ` Mel Gorman
2017-04-06 17:46                 ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170403202337.GA12482@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=arbab@linux.vnet.ibm.com \
    --cc=cmetcalf@mellanox.com \
    --cc=dan.j.williams@gmail.com \
    --cc=daniel.kiper@oracle.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=imammedo@redhat.com \
    --cc=js1304@gmail.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=qiuxishi@huawei.com \
    --cc=rientjes@google.com \
    --cc=schwidefsky@de.ibm.com \
    --cc=slaoub@gmail.com \
    --cc=tangchen@cn.fujitsu.com \
    --cc=toshi.kani@hpe.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=yasu.isimatu@gmail.com \
    --cc=zhenzhang.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.