linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
@ 2019-09-09 11:48 David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Michal Hocko, Oscar Salvador, Pavel Tatashin, Qian Cai,
	Sasha Levin, Stephen Hemminger, Wei Yang

Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"

Let's replace the __online_page...() functions by generic_online_page().
Hyper-V only wants to delay the actual onlining of un-backed pages, so we
can simpy re-use the generic function.

Only compile-tested.

Cc: Souptick Joarder <jrdr.linux@gmail.com>

David Hildenbrand (3):
  mm/memory_hotplug: Export generic_online_page()
  hv_balloon: Use generic_online_page()
  mm/memory_hotplug: Remove __online_page_free() and
    __online_page_increment_counters()

 drivers/hv/hv_balloon.c        |  3 +--
 include/linux/memory_hotplug.h |  4 +---
 mm/memory_hotplug.c            | 17 ++---------------
 3 files changed, 4 insertions(+), 20 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 1/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
@ 2019-09-09 11:48 ` David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand,
	Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin,
	Dan Williams, Wei Yang, Qian Cai

Let's expose generic_online_page() so online_page_callback users can
simply fallback to the generic implementation when actually deciding to
online the pages.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h | 1 +
 mm/memory_hotplug.c            | 5 ++---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 8ee3a2ae5131..71a620eabb62 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -102,6 +102,7 @@ extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
 
 typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
 
+extern void generic_online_page(struct page *page, unsigned int order);
 extern int set_online_page_callback(online_page_callback_t callback);
 extern int restore_online_page_callback(online_page_callback_t callback);
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 2f0d2908e235..f32a5feaf7ff 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -49,8 +49,6 @@
  * and restore_online_page_callback() for generic callback restore.
  */
 
-static void generic_online_page(struct page *page, unsigned int order);
-
 static online_page_callback_t online_page_callback = generic_online_page;
 static DEFINE_MUTEX(online_page_callback_lock);
 
@@ -616,7 +614,7 @@ void __online_page_free(struct page *page)
 }
 EXPORT_SYMBOL_GPL(__online_page_free);
 
-static void generic_online_page(struct page *page, unsigned int order)
+void generic_online_page(struct page *page, unsigned int order)
 {
 	kernel_map_pages(page, 1 << order, 1);
 	__free_pages_core(page, order);
@@ -626,6 +624,7 @@ static void generic_online_page(struct page *page, unsigned int order)
 		totalhigh_pages_add(1UL << order);
 #endif
 }
+EXPORT_SYMBOL_GPL(generic_online_page);
 
 static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 			void *arg)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 2/3] hv_balloon: Use generic_online_page()
  2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand
@ 2019-09-09 11:48 ` David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand
  2019-09-20  8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
  3 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin

Let's use the generic onlining function - which will now also take care
of calling kernel_map_pages().

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sasha Levin <sashal@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 drivers/hv/hv_balloon.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index a91c90d4402c..35f123b459c8 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -680,8 +680,7 @@ static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
 		__ClearPageOffline(pg);
 
 	/* This frame is currently backed; online the page. */
-	__online_page_increment_counters(pg);
-	__online_page_free(pg);
+	generic_online_page(pg, 0);
 
 	lockdep_assert_held(&dm_device.ha_lock);
 	dm_device.num_pages_onlined++;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters()
  2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand
  2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand
@ 2019-09-09 11:48 ` David Hildenbrand
  2019-09-20  8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
  3 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand,
	Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin,
	Wei Yang, Dan Williams, Qian Cai

Let's drop the now unused functions.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h |  3 ---
 mm/memory_hotplug.c            | 12 ------------
 2 files changed, 15 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 71a620eabb62..933f2bb3bdbb 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -106,9 +106,6 @@ extern void generic_online_page(struct page *page, unsigned int order);
 extern int set_online_page_callback(online_page_callback_t callback);
 extern int restore_online_page_callback(online_page_callback_t callback);
 
-extern void __online_page_increment_counters(struct page *page);
-extern void __online_page_free(struct page *page);
-
 extern int try_online_node(int nid);
 
 extern int arch_add_memory(int nid, u64 start, u64 size,
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f32a5feaf7ff..16dd5b1498e8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -602,18 +602,6 @@ int restore_online_page_callback(online_page_callback_t callback)
 }
 EXPORT_SYMBOL_GPL(restore_online_page_callback);
 
-void __online_page_increment_counters(struct page *page)
-{
-	adjust_managed_page_count(page, 1);
-}
-EXPORT_SYMBOL_GPL(__online_page_increment_counters);
-
-void __online_page_free(struct page *page)
-{
-	__free_reserved_page(page);
-}
-EXPORT_SYMBOL_GPL(__online_page_free);
-
 void generic_online_page(struct page *page, unsigned int order)
 {
 	kernel_map_pages(page, 1 << order, 1);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand
@ 2019-09-20  8:17 ` David Hildenbrand
  2019-09-23  8:58   ` Michal Hocko
  3 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-09-20  8:17 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton,
	Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Michal Hocko,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 09.09.19 13:48, David Hildenbrand wrote:
> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
> 
> Let's replace the __online_page...() functions by generic_online_page().
> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
> can simpy re-use the generic function.
> 
> Only compile-tested.
> 
> Cc: Souptick Joarder <jrdr.linux@gmail.com>
> 
> David Hildenbrand (3):
>   mm/memory_hotplug: Export generic_online_page()
>   hv_balloon: Use generic_online_page()
>   mm/memory_hotplug: Remove __online_page_free() and
>     __online_page_increment_counters()
> 
>  drivers/hv/hv_balloon.c        |  3 +--
>  include/linux/memory_hotplug.h |  4 +---
>  mm/memory_hotplug.c            | 17 ++---------------
>  3 files changed, 4 insertions(+), 20 deletions(-)
> 

Ping, any comments on this one?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-20  8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
@ 2019-09-23  8:58   ` Michal Hocko
  2019-09-23  9:31     ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-09-23  8:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
> On 09.09.19 13:48, David Hildenbrand wrote:
> > Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
> > 
> > Let's replace the __online_page...() functions by generic_online_page().
> > Hyper-V only wants to delay the actual onlining of un-backed pages, so we
> > can simpy re-use the generic function.
> > 
> > Only compile-tested.
> > 
> > Cc: Souptick Joarder <jrdr.linux@gmail.com>
> > 
> > David Hildenbrand (3):
> >   mm/memory_hotplug: Export generic_online_page()
> >   hv_balloon: Use generic_online_page()
> >   mm/memory_hotplug: Remove __online_page_free() and
> >     __online_page_increment_counters()
> > 
> >  drivers/hv/hv_balloon.c        |  3 +--
> >  include/linux/memory_hotplug.h |  4 +---
> >  mm/memory_hotplug.c            | 17 ++---------------
> >  3 files changed, 4 insertions(+), 20 deletions(-)
> > 
> 
> Ping, any comments on this one?

Unification makes a lot of sense to me. You can add
Acked-by: Michal Hocko <mhocko@suse.com>

I will most likely won't surprise if I asked for more here though ;)
I have to confess I really detest the whole concept of a hidden callback
with a very weird API. Is this something we can do about? I do realize
that adding a callback would require either cluttering the existing APIs
but maybe we can come up with something more clever. Or maybe existing
external users of online callback can do that as a separate step after
the online is completed - or is this impossible due to locking
guarantees?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23  8:58   ` Michal Hocko
@ 2019-09-23  9:31     ` David Hildenbrand
  2019-09-23 11:15       ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-09-23  9:31 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 23.09.19 10:58, Michal Hocko wrote:
> On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
>> On 09.09.19 13:48, David Hildenbrand wrote:
>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
>>>
>>> Let's replace the __online_page...() functions by generic_online_page().
>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
>>> can simpy re-use the generic function.
>>>
>>> Only compile-tested.
>>>
>>> Cc: Souptick Joarder <jrdr.linux@gmail.com>
>>>
>>> David Hildenbrand (3):
>>>   mm/memory_hotplug: Export generic_online_page()
>>>   hv_balloon: Use generic_online_page()
>>>   mm/memory_hotplug: Remove __online_page_free() and
>>>     __online_page_increment_counters()
>>>
>>>  drivers/hv/hv_balloon.c        |  3 +--
>>>  include/linux/memory_hotplug.h |  4 +---
>>>  mm/memory_hotplug.c            | 17 ++---------------
>>>  3 files changed, 4 insertions(+), 20 deletions(-)
>>>
>>
>> Ping, any comments on this one?
> 
> Unification makes a lot of sense to me. You can add
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> I will most likely won't surprise if I asked for more here though ;)

I'm not surprised, but definitely not in a negative sense ;) I was
asking myself if we could somehow rework this, too.

> I have to confess I really detest the whole concept of a hidden callback
> with a very weird API. Is this something we can do about? I do realize
> that adding a callback would require either cluttering the existing APIs
> but maybe we can come up with something more clever. Or maybe existing
> external users of online callback can do that as a separate step after
> the online is completed - or is this impossible due to locking
> guarantees?
> 

The use case of this (somewhat special) callback really is to avoid
selected (unbacked in the hypervisor) pages to get put to the buddy just
now, but instead to defer that (sometimes, defer till infinity ;) ).
Especially, to hinder these pages from getting touched at all. Pages
that won't be put to the buddy will usually get PG_offline set (e.g.,
Hyper-V and XEN) - the only two users I am aware of.

For Hyper-V (and also eventually virtio-mem), it is important to set
PG_offline before marking the section to be online (SECTION_IS_ONLINE).
Only this way, PG_offline is properly set on all pfn_to_online_page()
pages, meaning "don't touch this page" - e.g., used to skip over such
pages when suspending or by makedumpfile to skip over such offline pages
when creating a memory dump.

So if we would e.g., try to piggy-back onto the memory_notify()
infrastructure, we could
1. Online all pages to the buddy (dropping the callback)
2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
-> in the notifier, pull pages from the buddy, mark sections online
3. Set all involved sections online (online_mem_sections())

However, I am not sure what actually happens after 1. - we are only
holding the device hotplug lock and the memory hotplug lock, so the
pages can just get allocated. Also, it sounds like more work and code
for the same end result (okay, if the rework is really necessary, though).

So yeah, while the current callback might not be optimal, I don't see an
easy and clean way to rework this. With the change in this series we are
at least able to simply defer doing what would have been done without
the callback - not perfect but better.

Do you have anything in mind that could work out and make this nicer?

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23  9:31     ` David Hildenbrand
@ 2019-09-23 11:15       ` Michal Hocko
  2019-09-23 11:34         ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-09-23 11:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On Mon 23-09-19 11:31:30, David Hildenbrand wrote:
> On 23.09.19 10:58, Michal Hocko wrote:
> > On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
> >> On 09.09.19 13:48, David Hildenbrand wrote:
> >>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
> >>>
> >>> Let's replace the __online_page...() functions by generic_online_page().
> >>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
> >>> can simpy re-use the generic function.
> >>>
> >>> Only compile-tested.
> >>>
> >>> Cc: Souptick Joarder <jrdr.linux@gmail.com>
> >>>
> >>> David Hildenbrand (3):
> >>>   mm/memory_hotplug: Export generic_online_page()
> >>>   hv_balloon: Use generic_online_page()
> >>>   mm/memory_hotplug: Remove __online_page_free() and
> >>>     __online_page_increment_counters()
> >>>
> >>>  drivers/hv/hv_balloon.c        |  3 +--
> >>>  include/linux/memory_hotplug.h |  4 +---
> >>>  mm/memory_hotplug.c            | 17 ++---------------
> >>>  3 files changed, 4 insertions(+), 20 deletions(-)
> >>>
> >>
> >> Ping, any comments on this one?
> > 
> > Unification makes a lot of sense to me. You can add
> > Acked-by: Michal Hocko <mhocko@suse.com>
> > 
> > I will most likely won't surprise if I asked for more here though ;)
> 
> I'm not surprised, but definitely not in a negative sense ;) I was
> asking myself if we could somehow rework this, too.
> 
> > I have to confess I really detest the whole concept of a hidden callback
> > with a very weird API. Is this something we can do about? I do realize
> > that adding a callback would require either cluttering the existing APIs
> > but maybe we can come up with something more clever. Or maybe existing
> > external users of online callback can do that as a separate step after
> > the online is completed - or is this impossible due to locking
> > guarantees?
> > 
> 
> The use case of this (somewhat special) callback really is to avoid
> selected (unbacked in the hypervisor) pages to get put to the buddy just
> now, but instead to defer that (sometimes, defer till infinity ;) ).
> Especially, to hinder these pages from getting touched at all. Pages
> that won't be put to the buddy will usually get PG_offline set (e.g.,
> Hyper-V and XEN) - the only two users I am aware of.
> 
> For Hyper-V (and also eventually virtio-mem), it is important to set
> PG_offline before marking the section to be online (SECTION_IS_ONLINE).
> Only this way, PG_offline is properly set on all pfn_to_online_page()
> pages, meaning "don't touch this page" - e.g., used to skip over such
> pages when suspending or by makedumpfile to skip over such offline pages
> when creating a memory dump.

Thanks for the clarification. I have never really studied what those
callbacks are doing really.

> So if we would e.g., try to piggy-back onto the memory_notify()
> infrastructure, we could
> 1. Online all pages to the buddy (dropping the callback)
> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
> -> in the notifier, pull pages from the buddy, mark sections online
> 3. Set all involved sections online (online_mem_sections())

This doesn't really sound any better. For one pages are immediately
usable when they hit the buddy allocator so this is racy and thus not
reliable.

> However, I am not sure what actually happens after 1. - we are only
> holding the device hotplug lock and the memory hotplug lock, so the
> pages can just get allocated. Also, it sounds like more work and code
> for the same end result (okay, if the rework is really necessary, though).
> 
> So yeah, while the current callback might not be optimal, I don't see an
> easy and clean way to rework this. With the change in this series we are
> at least able to simply defer doing what would have been done without
> the callback - not perfect but better.
> 
> Do you have anything in mind that could work out and make this nicer?

I am wondering why those pages get onlined when they are, in fact,
supposed to be offline.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 11:15       ` Michal Hocko
@ 2019-09-23 11:34         ` David Hildenbrand
  2019-09-23 11:43           ` David Hildenbrand
  2019-09-23 12:07           ` Michal Hocko
  0 siblings, 2 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-23 11:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 23.09.19 13:15, Michal Hocko wrote:
> On Mon 23-09-19 11:31:30, David Hildenbrand wrote:
>> On 23.09.19 10:58, Michal Hocko wrote:
>>> On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
>>>> On 09.09.19 13:48, David Hildenbrand wrote:
>>>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
>>>>>
>>>>> Let's replace the __online_page...() functions by generic_online_page().
>>>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
>>>>> can simpy re-use the generic function.
>>>>>
>>>>> Only compile-tested.
>>>>>
>>>>> Cc: Souptick Joarder <jrdr.linux@gmail.com>
>>>>>
>>>>> David Hildenbrand (3):
>>>>>   mm/memory_hotplug: Export generic_online_page()
>>>>>   hv_balloon: Use generic_online_page()
>>>>>   mm/memory_hotplug: Remove __online_page_free() and
>>>>>     __online_page_increment_counters()
>>>>>
>>>>>  drivers/hv/hv_balloon.c        |  3 +--
>>>>>  include/linux/memory_hotplug.h |  4 +---
>>>>>  mm/memory_hotplug.c            | 17 ++---------------
>>>>>  3 files changed, 4 insertions(+), 20 deletions(-)
>>>>>
>>>>
>>>> Ping, any comments on this one?
>>>
>>> Unification makes a lot of sense to me. You can add
>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>
>>> I will most likely won't surprise if I asked for more here though ;)
>>
>> I'm not surprised, but definitely not in a negative sense ;) I was
>> asking myself if we could somehow rework this, too.
>>
>>> I have to confess I really detest the whole concept of a hidden callback
>>> with a very weird API. Is this something we can do about? I do realize
>>> that adding a callback would require either cluttering the existing APIs
>>> but maybe we can come up with something more clever. Or maybe existing
>>> external users of online callback can do that as a separate step after
>>> the online is completed - or is this impossible due to locking
>>> guarantees?
>>>
>>
>> The use case of this (somewhat special) callback really is to avoid
>> selected (unbacked in the hypervisor) pages to get put to the buddy just
>> now, but instead to defer that (sometimes, defer till infinity ;) ).
>> Especially, to hinder these pages from getting touched at all. Pages
>> that won't be put to the buddy will usually get PG_offline set (e.g.,
>> Hyper-V and XEN) - the only two users I am aware of.
>>
>> For Hyper-V (and also eventually virtio-mem), it is important to set
>> PG_offline before marking the section to be online (SECTION_IS_ONLINE).
>> Only this way, PG_offline is properly set on all pfn_to_online_page()
>> pages, meaning "don't touch this page" - e.g., used to skip over such
>> pages when suspending or by makedumpfile to skip over such offline pages
>> when creating a memory dump.
> 
> Thanks for the clarification. I have never really studied what those
> callbacks are doing really.
> 
>> So if we would e.g., try to piggy-back onto the memory_notify()
>> infrastructure, we could
>> 1. Online all pages to the buddy (dropping the callback)
>> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
>> -> in the notifier, pull pages from the buddy, mark sections online
>> 3. Set all involved sections online (online_mem_sections())
> 
> This doesn't really sound any better. For one pages are immediately
> usable when they hit the buddy allocator so this is racy and thus not
> reliable.
> 
>> However, I am not sure what actually happens after 1. - we are only
>> holding the device hotplug lock and the memory hotplug lock, so the
>> pages can just get allocated. Also, it sounds like more work and code
>> for the same end result (okay, if the rework is really necessary, though).
>>
>> So yeah, while the current callback might not be optimal, I don't see an
>> easy and clean way to rework this. With the change in this series we are
>> at least able to simply defer doing what would have been done without
>> the callback - not perfect but better.
>>
>> Do you have anything in mind that could work out and make this nicer?
> 
> I am wondering why those pages get onlined when they are, in fact,
> supposed to be offline.
> 

It's the current way of emulating sub-memory-block hotplug on top of the
memory bock device API we have. Hyper-V and XEN have been using that for
a long time.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 11:34         ` David Hildenbrand
@ 2019-09-23 11:43           ` David Hildenbrand
  2019-09-23 12:07           ` Michal Hocko
  1 sibling, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-23 11:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 23.09.19 13:34, David Hildenbrand wrote:
> On 23.09.19 13:15, Michal Hocko wrote:
>> On Mon 23-09-19 11:31:30, David Hildenbrand wrote:
>>> On 23.09.19 10:58, Michal Hocko wrote:
>>>> On Fri 20-09-19 10:17:54, David Hildenbrand wrote:
>>>>> On 09.09.19 13:48, David Hildenbrand wrote:
>>>>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()"
>>>>>>
>>>>>> Let's replace the __online_page...() functions by generic_online_page().
>>>>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we
>>>>>> can simpy re-use the generic function.
>>>>>>
>>>>>> Only compile-tested.
>>>>>>
>>>>>> Cc: Souptick Joarder <jrdr.linux@gmail.com>
>>>>>>
>>>>>> David Hildenbrand (3):
>>>>>>   mm/memory_hotplug: Export generic_online_page()
>>>>>>   hv_balloon: Use generic_online_page()
>>>>>>   mm/memory_hotplug: Remove __online_page_free() and
>>>>>>     __online_page_increment_counters()
>>>>>>
>>>>>>  drivers/hv/hv_balloon.c        |  3 +--
>>>>>>  include/linux/memory_hotplug.h |  4 +---
>>>>>>  mm/memory_hotplug.c            | 17 ++---------------
>>>>>>  3 files changed, 4 insertions(+), 20 deletions(-)
>>>>>>
>>>>>
>>>>> Ping, any comments on this one?
>>>>
>>>> Unification makes a lot of sense to me. You can add
>>>> Acked-by: Michal Hocko <mhocko@suse.com>
>>>>
>>>> I will most likely won't surprise if I asked for more here though ;)
>>>
>>> I'm not surprised, but definitely not in a negative sense ;) I was
>>> asking myself if we could somehow rework this, too.
>>>
>>>> I have to confess I really detest the whole concept of a hidden callback
>>>> with a very weird API. Is this something we can do about? I do realize
>>>> that adding a callback would require either cluttering the existing APIs
>>>> but maybe we can come up with something more clever. Or maybe existing
>>>> external users of online callback can do that as a separate step after
>>>> the online is completed - or is this impossible due to locking
>>>> guarantees?
>>>>
>>>
>>> The use case of this (somewhat special) callback really is to avoid
>>> selected (unbacked in the hypervisor) pages to get put to the buddy just
>>> now, but instead to defer that (sometimes, defer till infinity ;) ).
>>> Especially, to hinder these pages from getting touched at all. Pages
>>> that won't be put to the buddy will usually get PG_offline set (e.g.,
>>> Hyper-V and XEN) - the only two users I am aware of.
>>>
>>> For Hyper-V (and also eventually virtio-mem), it is important to set
>>> PG_offline before marking the section to be online (SECTION_IS_ONLINE).
>>> Only this way, PG_offline is properly set on all pfn_to_online_page()
>>> pages, meaning "don't touch this page" - e.g., used to skip over such
>>> pages when suspending or by makedumpfile to skip over such offline pages
>>> when creating a memory dump.
>>
>> Thanks for the clarification. I have never really studied what those
>> callbacks are doing really.
>>
>>> So if we would e.g., try to piggy-back onto the memory_notify()
>>> infrastructure, we could
>>> 1. Online all pages to the buddy (dropping the callback)
>>> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg);
>>> -> in the notifier, pull pages from the buddy, mark sections online
>>> 3. Set all involved sections online (online_mem_sections())
>>
>> This doesn't really sound any better. For one pages are immediately
>> usable when they hit the buddy allocator so this is racy and thus not
>> reliable.
>>
>>> However, I am not sure what actually happens after 1. - we are only
>>> holding the device hotplug lock and the memory hotplug lock, so the
>>> pages can just get allocated. Also, it sounds like more work and code
>>> for the same end result (okay, if the rework is really necessary, though).
>>>
>>> So yeah, while the current callback might not be optimal, I don't see an
>>> easy and clean way to rework this. With the change in this series we are
>>> at least able to simply defer doing what would have been done without
>>> the callback - not perfect but better.
>>>
>>> Do you have anything in mind that could work out and make this nicer?
>>
>> I am wondering why those pages get onlined when they are, in fact,
>> supposed to be offline.
>>
> 
> It's the current way of emulating sub-memory-block hotplug on top of the
> memory bock device API we have. Hyper-V and XEN have been using that for
> a long time.
> 

So one idea would be to let clients set pages to PG_offline during
MEM_GOING_ONLINE. We could then skip any PG_offline pages when onlining
pages, not onlining them to the buddy.

But then, there still has to be a way to online pages when required -
e.g., generic_online_page(). At least the callback could go.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 11:34         ` David Hildenbrand
  2019-09-23 11:43           ` David Hildenbrand
@ 2019-09-23 12:07           ` Michal Hocko
  2019-09-23 12:20             ` David Hildenbrand
  1 sibling, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-09-23 12:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On Mon 23-09-19 13:34:18, David Hildenbrand wrote:
> On 23.09.19 13:15, Michal Hocko wrote:
[...]
> > I am wondering why those pages get onlined when they are, in fact,
> > supposed to be offline.
> > 
> 
> It's the current way of emulating sub-memory-block hotplug on top of the
> memory bock device API we have. Hyper-V and XEN have been using that for
> a long time.

Do they really have to use the existing block interface when they in
fact do not operate on the block granularity? Zone device memory already
acts on sub section/block boundaries.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 12:07           ` Michal Hocko
@ 2019-09-23 12:20             ` David Hildenbrand
  2019-09-23 12:30               ` Michal Hocko
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2019-09-23 12:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 23.09.19 14:07, Michal Hocko wrote:
> On Mon 23-09-19 13:34:18, David Hildenbrand wrote:
>> On 23.09.19 13:15, Michal Hocko wrote:
> [...]
>>> I am wondering why those pages get onlined when they are, in fact,
>>> supposed to be offline.
>>>
>>
>> It's the current way of emulating sub-memory-block hotplug on top of the
>> memory bock device API we have. Hyper-V and XEN have been using that for
>> a long time.
> 
> Do they really have to use the existing block interface when they in
> fact do not operate on the block granularity? Zone device memory already
> acts on sub section/block boundaries.
> 

Yes, we need memory blocks, especially for user space to properly online
them (as we discussed a while back, to decide on a zone) and for udev
events, to e.g., properly reload kexec when memory blocks get
added/removed/onlined/offlined.

ZONE_DEVICE is special as it doesn't have to worry about
onlining/offlining/dumping. Also, it doesn't care about
SECTION_IS_ONLINE, but figuring out how to detect which sub
sections/blocks have a valid memmap is still something to get fixed and
is still getting discussed on MM.

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 12:20             ` David Hildenbrand
@ 2019-09-23 12:30               ` Michal Hocko
  2019-09-23 12:34                 ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Michal Hocko @ 2019-09-23 12:30 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On Mon 23-09-19 14:20:05, David Hildenbrand wrote:
> On 23.09.19 14:07, Michal Hocko wrote:
> > On Mon 23-09-19 13:34:18, David Hildenbrand wrote:
> >> On 23.09.19 13:15, Michal Hocko wrote:
> > [...]
> >>> I am wondering why those pages get onlined when they are, in fact,
> >>> supposed to be offline.
> >>>
> >>
> >> It's the current way of emulating sub-memory-block hotplug on top of the
> >> memory bock device API we have. Hyper-V and XEN have been using that for
> >> a long time.
> > 
> > Do they really have to use the existing block interface when they in
> > fact do not operate on the block granularity? Zone device memory already
> > acts on sub section/block boundaries.
> > 
> 
> Yes, we need memory blocks, especially for user space to properly online
> them (as we discussed a while back, to decide on a zone) and for udev
> events, to e.g., properly reload kexec when memory blocks get
> added/removed/onlined/offlined.

Just to make sure I really follow. We need a user interface to control
where the memory gets onlined but it is the driver which determines
which part of the block really gets onlined, right?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page()
  2019-09-23 12:30               ` Michal Hocko
@ 2019-09-23 12:34                 ` David Hildenbrand
  0 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2019-09-23 12:34 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv,
	Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan,
	Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin,
	Stephen Hemminger, Wei Yang

On 23.09.19 14:30, Michal Hocko wrote:
> On Mon 23-09-19 14:20:05, David Hildenbrand wrote:
>> On 23.09.19 14:07, Michal Hocko wrote:
>>> On Mon 23-09-19 13:34:18, David Hildenbrand wrote:
>>>> On 23.09.19 13:15, Michal Hocko wrote:
>>> [...]
>>>>> I am wondering why those pages get onlined when they are, in fact,
>>>>> supposed to be offline.
>>>>>
>>>>
>>>> It's the current way of emulating sub-memory-block hotplug on top of the
>>>> memory bock device API we have. Hyper-V and XEN have been using that for
>>>> a long time.
>>>
>>> Do they really have to use the existing block interface when they in
>>> fact do not operate on the block granularity? Zone device memory already
>>> acts on sub section/block boundaries.
>>>
>>
>> Yes, we need memory blocks, especially for user space to properly online
>> them (as we discussed a while back, to decide on a zone) and for udev
>> events, to e.g., properly reload kexec when memory blocks get
>> added/removed/onlined/offlined.
> 
> Just to make sure I really follow. We need a user interface to control
> where the memory gets onlined but it is the driver which determines
> which part of the block really gets onlined, right?
> 

Yes, it's the drivers job to decide which part of the memory block can
actually get used, and when.

It's a little bit like the driver immediately allocates unbacked memory
again, to free that memory to the buddy once requested. (similar to how
ballooning works).

-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-09-23 12:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand
2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand
2019-09-20  8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand
2019-09-23  8:58   ` Michal Hocko
2019-09-23  9:31     ` David Hildenbrand
2019-09-23 11:15       ` Michal Hocko
2019-09-23 11:34         ` David Hildenbrand
2019-09-23 11:43           ` David Hildenbrand
2019-09-23 12:07           ` Michal Hocko
2019-09-23 12:20             ` David Hildenbrand
2019-09-23 12:30               ` Michal Hocko
2019-09-23 12:34                 ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).