* [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() @ 2019-09-09 11:48 David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Michal Hocko, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" Let's replace the __online_page...() functions by generic_online_page(). Hyper-V only wants to delay the actual onlining of un-backed pages, so we can simpy re-use the generic function. Only compile-tested. Cc: Souptick Joarder <jrdr.linux@gmail.com> David Hildenbrand (3): mm/memory_hotplug: Export generic_online_page() hv_balloon: Use generic_online_page() mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() drivers/hv/hv_balloon.c | 3 +-- include/linux/memory_hotplug.h | 4 +--- mm/memory_hotplug.c | 17 ++--------------- 3 files changed, 4 insertions(+), 20 deletions(-) -- 2.21.0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v1 1/3] mm/memory_hotplug: Export generic_online_page() 2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand @ 2019-09-09 11:48 ` David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand ` (2 subsequent siblings) 3 siblings, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Dan Williams, Wei Yang, Qian Cai Let's expose generic_online_page() so online_page_callback users can simply fallback to the generic implementation when actually deciding to online the pages. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Qian Cai <cai@lca.pw> Signed-off-by: David Hildenbrand <david@redhat.com> --- include/linux/memory_hotplug.h | 1 + mm/memory_hotplug.c | 5 ++--- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 8ee3a2ae5131..71a620eabb62 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -102,6 +102,7 @@ extern unsigned long __offline_isolated_pages(unsigned long start_pfn, typedef void (*online_page_callback_t)(struct page *page, unsigned int order); +extern void generic_online_page(struct page *page, unsigned int order); extern int set_online_page_callback(online_page_callback_t callback); extern int restore_online_page_callback(online_page_callback_t callback); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 2f0d2908e235..f32a5feaf7ff 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -49,8 +49,6 @@ * and restore_online_page_callback() for generic callback restore. */ -static void generic_online_page(struct page *page, unsigned int order); - static online_page_callback_t online_page_callback = generic_online_page; static DEFINE_MUTEX(online_page_callback_lock); @@ -616,7 +614,7 @@ void __online_page_free(struct page *page) } EXPORT_SYMBOL_GPL(__online_page_free); -static void generic_online_page(struct page *page, unsigned int order) +void generic_online_page(struct page *page, unsigned int order) { kernel_map_pages(page, 1 << order, 1); __free_pages_core(page, order); @@ -626,6 +624,7 @@ static void generic_online_page(struct page *page, unsigned int order) totalhigh_pages_add(1UL << order); #endif } +EXPORT_SYMBOL_GPL(generic_online_page); static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, void *arg) -- 2.21.0 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v1 2/3] hv_balloon: Use generic_online_page() 2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand @ 2019-09-09 11:48 ` David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand 2019-09-20 8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 3 siblings, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger, Sasha Levin Let's use the generic onlining function - which will now also take care of calling kernel_map_pages(). Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/hv/hv_balloon.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index a91c90d4402c..35f123b459c8 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -680,8 +680,7 @@ static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg) __ClearPageOffline(pg); /* This frame is currently backed; online the page. */ - __online_page_increment_counters(pg); - __online_page_free(pg); + generic_online_page(pg, 0); lockdep_assert_held(&dm_device.ha_lock); dm_device.num_pages_onlined++; -- 2.21.0 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() 2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand @ 2019-09-09 11:48 ` David Hildenbrand 2019-09-20 8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 3 siblings, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-09 11:48 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Souptick Joarder, linux-hyperv, David Hildenbrand, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Dan Williams, Qian Cai Let's drop the now unused functions. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Qian Cai <cai@lca.pw> Signed-off-by: David Hildenbrand <david@redhat.com> --- include/linux/memory_hotplug.h | 3 --- mm/memory_hotplug.c | 12 ------------ 2 files changed, 15 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 71a620eabb62..933f2bb3bdbb 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -106,9 +106,6 @@ extern void generic_online_page(struct page *page, unsigned int order); extern int set_online_page_callback(online_page_callback_t callback); extern int restore_online_page_callback(online_page_callback_t callback); -extern void __online_page_increment_counters(struct page *page); -extern void __online_page_free(struct page *page); - extern int try_online_node(int nid); extern int arch_add_memory(int nid, u64 start, u64 size, diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index f32a5feaf7ff..16dd5b1498e8 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -602,18 +602,6 @@ int restore_online_page_callback(online_page_callback_t callback) } EXPORT_SYMBOL_GPL(restore_online_page_callback); -void __online_page_increment_counters(struct page *page) -{ - adjust_managed_page_count(page, 1); -} -EXPORT_SYMBOL_GPL(__online_page_increment_counters); - -void __online_page_free(struct page *page) -{ - __free_reserved_page(page); -} -EXPORT_SYMBOL_GPL(__online_page_free); - void generic_online_page(struct page *page, unsigned int order) { kernel_map_pages(page, 1 << order, 1); -- 2.21.0 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand ` (2 preceding siblings ...) 2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand @ 2019-09-20 8:17 ` David Hildenbrand 2019-09-23 8:58 ` Michal Hocko 3 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2019-09-20 8:17 UTC (permalink / raw) To: linux-kernel Cc: linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Michal Hocko, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 09.09.19 13:48, David Hildenbrand wrote: > Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" > > Let's replace the __online_page...() functions by generic_online_page(). > Hyper-V only wants to delay the actual onlining of un-backed pages, so we > can simpy re-use the generic function. > > Only compile-tested. > > Cc: Souptick Joarder <jrdr.linux@gmail.com> > > David Hildenbrand (3): > mm/memory_hotplug: Export generic_online_page() > hv_balloon: Use generic_online_page() > mm/memory_hotplug: Remove __online_page_free() and > __online_page_increment_counters() > > drivers/hv/hv_balloon.c | 3 +-- > include/linux/memory_hotplug.h | 4 +--- > mm/memory_hotplug.c | 17 ++--------------- > 3 files changed, 4 insertions(+), 20 deletions(-) > Ping, any comments on this one? -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-20 8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand @ 2019-09-23 8:58 ` Michal Hocko 2019-09-23 9:31 ` David Hildenbrand 0 siblings, 1 reply; 14+ messages in thread From: Michal Hocko @ 2019-09-23 8:58 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On Fri 20-09-19 10:17:54, David Hildenbrand wrote: > On 09.09.19 13:48, David Hildenbrand wrote: > > Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" > > > > Let's replace the __online_page...() functions by generic_online_page(). > > Hyper-V only wants to delay the actual onlining of un-backed pages, so we > > can simpy re-use the generic function. > > > > Only compile-tested. > > > > Cc: Souptick Joarder <jrdr.linux@gmail.com> > > > > David Hildenbrand (3): > > mm/memory_hotplug: Export generic_online_page() > > hv_balloon: Use generic_online_page() > > mm/memory_hotplug: Remove __online_page_free() and > > __online_page_increment_counters() > > > > drivers/hv/hv_balloon.c | 3 +-- > > include/linux/memory_hotplug.h | 4 +--- > > mm/memory_hotplug.c | 17 ++--------------- > > 3 files changed, 4 insertions(+), 20 deletions(-) > > > > Ping, any comments on this one? Unification makes a lot of sense to me. You can add Acked-by: Michal Hocko <mhocko@suse.com> I will most likely won't surprise if I asked for more here though ;) I have to confess I really detest the whole concept of a hidden callback with a very weird API. Is this something we can do about? I do realize that adding a callback would require either cluttering the existing APIs but maybe we can come up with something more clever. Or maybe existing external users of online callback can do that as a separate step after the online is completed - or is this impossible due to locking guarantees? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 8:58 ` Michal Hocko @ 2019-09-23 9:31 ` David Hildenbrand 2019-09-23 11:15 ` Michal Hocko 0 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2019-09-23 9:31 UTC (permalink / raw) To: Michal Hocko Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 23.09.19 10:58, Michal Hocko wrote: > On Fri 20-09-19 10:17:54, David Hildenbrand wrote: >> On 09.09.19 13:48, David Hildenbrand wrote: >>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" >>> >>> Let's replace the __online_page...() functions by generic_online_page(). >>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we >>> can simpy re-use the generic function. >>> >>> Only compile-tested. >>> >>> Cc: Souptick Joarder <jrdr.linux@gmail.com> >>> >>> David Hildenbrand (3): >>> mm/memory_hotplug: Export generic_online_page() >>> hv_balloon: Use generic_online_page() >>> mm/memory_hotplug: Remove __online_page_free() and >>> __online_page_increment_counters() >>> >>> drivers/hv/hv_balloon.c | 3 +-- >>> include/linux/memory_hotplug.h | 4 +--- >>> mm/memory_hotplug.c | 17 ++--------------- >>> 3 files changed, 4 insertions(+), 20 deletions(-) >>> >> >> Ping, any comments on this one? > > Unification makes a lot of sense to me. You can add > Acked-by: Michal Hocko <mhocko@suse.com> > > I will most likely won't surprise if I asked for more here though ;) I'm not surprised, but definitely not in a negative sense ;) I was asking myself if we could somehow rework this, too. > I have to confess I really detest the whole concept of a hidden callback > with a very weird API. Is this something we can do about? I do realize > that adding a callback would require either cluttering the existing APIs > but maybe we can come up with something more clever. Or maybe existing > external users of online callback can do that as a separate step after > the online is completed - or is this impossible due to locking > guarantees? > The use case of this (somewhat special) callback really is to avoid selected (unbacked in the hypervisor) pages to get put to the buddy just now, but instead to defer that (sometimes, defer till infinity ;) ). Especially, to hinder these pages from getting touched at all. Pages that won't be put to the buddy will usually get PG_offline set (e.g., Hyper-V and XEN) - the only two users I am aware of. For Hyper-V (and also eventually virtio-mem), it is important to set PG_offline before marking the section to be online (SECTION_IS_ONLINE). Only this way, PG_offline is properly set on all pfn_to_online_page() pages, meaning "don't touch this page" - e.g., used to skip over such pages when suspending or by makedumpfile to skip over such offline pages when creating a memory dump. So if we would e.g., try to piggy-back onto the memory_notify() infrastructure, we could 1. Online all pages to the buddy (dropping the callback) 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg); -> in the notifier, pull pages from the buddy, mark sections online 3. Set all involved sections online (online_mem_sections()) However, I am not sure what actually happens after 1. - we are only holding the device hotplug lock and the memory hotplug lock, so the pages can just get allocated. Also, it sounds like more work and code for the same end result (okay, if the rework is really necessary, though). So yeah, while the current callback might not be optimal, I don't see an easy and clean way to rework this. With the change in this series we are at least able to simply defer doing what would have been done without the callback - not perfect but better. Do you have anything in mind that could work out and make this nicer? -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 9:31 ` David Hildenbrand @ 2019-09-23 11:15 ` Michal Hocko 2019-09-23 11:34 ` David Hildenbrand 0 siblings, 1 reply; 14+ messages in thread From: Michal Hocko @ 2019-09-23 11:15 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On Mon 23-09-19 11:31:30, David Hildenbrand wrote: > On 23.09.19 10:58, Michal Hocko wrote: > > On Fri 20-09-19 10:17:54, David Hildenbrand wrote: > >> On 09.09.19 13:48, David Hildenbrand wrote: > >>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" > >>> > >>> Let's replace the __online_page...() functions by generic_online_page(). > >>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we > >>> can simpy re-use the generic function. > >>> > >>> Only compile-tested. > >>> > >>> Cc: Souptick Joarder <jrdr.linux@gmail.com> > >>> > >>> David Hildenbrand (3): > >>> mm/memory_hotplug: Export generic_online_page() > >>> hv_balloon: Use generic_online_page() > >>> mm/memory_hotplug: Remove __online_page_free() and > >>> __online_page_increment_counters() > >>> > >>> drivers/hv/hv_balloon.c | 3 +-- > >>> include/linux/memory_hotplug.h | 4 +--- > >>> mm/memory_hotplug.c | 17 ++--------------- > >>> 3 files changed, 4 insertions(+), 20 deletions(-) > >>> > >> > >> Ping, any comments on this one? > > > > Unification makes a lot of sense to me. You can add > > Acked-by: Michal Hocko <mhocko@suse.com> > > > > I will most likely won't surprise if I asked for more here though ;) > > I'm not surprised, but definitely not in a negative sense ;) I was > asking myself if we could somehow rework this, too. > > > I have to confess I really detest the whole concept of a hidden callback > > with a very weird API. Is this something we can do about? I do realize > > that adding a callback would require either cluttering the existing APIs > > but maybe we can come up with something more clever. Or maybe existing > > external users of online callback can do that as a separate step after > > the online is completed - or is this impossible due to locking > > guarantees? > > > > The use case of this (somewhat special) callback really is to avoid > selected (unbacked in the hypervisor) pages to get put to the buddy just > now, but instead to defer that (sometimes, defer till infinity ;) ). > Especially, to hinder these pages from getting touched at all. Pages > that won't be put to the buddy will usually get PG_offline set (e.g., > Hyper-V and XEN) - the only two users I am aware of. > > For Hyper-V (and also eventually virtio-mem), it is important to set > PG_offline before marking the section to be online (SECTION_IS_ONLINE). > Only this way, PG_offline is properly set on all pfn_to_online_page() > pages, meaning "don't touch this page" - e.g., used to skip over such > pages when suspending or by makedumpfile to skip over such offline pages > when creating a memory dump. Thanks for the clarification. I have never really studied what those callbacks are doing really. > So if we would e.g., try to piggy-back onto the memory_notify() > infrastructure, we could > 1. Online all pages to the buddy (dropping the callback) > 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg); > -> in the notifier, pull pages from the buddy, mark sections online > 3. Set all involved sections online (online_mem_sections()) This doesn't really sound any better. For one pages are immediately usable when they hit the buddy allocator so this is racy and thus not reliable. > However, I am not sure what actually happens after 1. - we are only > holding the device hotplug lock and the memory hotplug lock, so the > pages can just get allocated. Also, it sounds like more work and code > for the same end result (okay, if the rework is really necessary, though). > > So yeah, while the current callback might not be optimal, I don't see an > easy and clean way to rework this. With the change in this series we are > at least able to simply defer doing what would have been done without > the callback - not perfect but better. > > Do you have anything in mind that could work out and make this nicer? I am wondering why those pages get onlined when they are, in fact, supposed to be offline. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 11:15 ` Michal Hocko @ 2019-09-23 11:34 ` David Hildenbrand 2019-09-23 11:43 ` David Hildenbrand 2019-09-23 12:07 ` Michal Hocko 0 siblings, 2 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-23 11:34 UTC (permalink / raw) To: Michal Hocko Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 23.09.19 13:15, Michal Hocko wrote: > On Mon 23-09-19 11:31:30, David Hildenbrand wrote: >> On 23.09.19 10:58, Michal Hocko wrote: >>> On Fri 20-09-19 10:17:54, David Hildenbrand wrote: >>>> On 09.09.19 13:48, David Hildenbrand wrote: >>>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" >>>>> >>>>> Let's replace the __online_page...() functions by generic_online_page(). >>>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we >>>>> can simpy re-use the generic function. >>>>> >>>>> Only compile-tested. >>>>> >>>>> Cc: Souptick Joarder <jrdr.linux@gmail.com> >>>>> >>>>> David Hildenbrand (3): >>>>> mm/memory_hotplug: Export generic_online_page() >>>>> hv_balloon: Use generic_online_page() >>>>> mm/memory_hotplug: Remove __online_page_free() and >>>>> __online_page_increment_counters() >>>>> >>>>> drivers/hv/hv_balloon.c | 3 +-- >>>>> include/linux/memory_hotplug.h | 4 +--- >>>>> mm/memory_hotplug.c | 17 ++--------------- >>>>> 3 files changed, 4 insertions(+), 20 deletions(-) >>>>> >>>> >>>> Ping, any comments on this one? >>> >>> Unification makes a lot of sense to me. You can add >>> Acked-by: Michal Hocko <mhocko@suse.com> >>> >>> I will most likely won't surprise if I asked for more here though ;) >> >> I'm not surprised, but definitely not in a negative sense ;) I was >> asking myself if we could somehow rework this, too. >> >>> I have to confess I really detest the whole concept of a hidden callback >>> with a very weird API. Is this something we can do about? I do realize >>> that adding a callback would require either cluttering the existing APIs >>> but maybe we can come up with something more clever. Or maybe existing >>> external users of online callback can do that as a separate step after >>> the online is completed - or is this impossible due to locking >>> guarantees? >>> >> >> The use case of this (somewhat special) callback really is to avoid >> selected (unbacked in the hypervisor) pages to get put to the buddy just >> now, but instead to defer that (sometimes, defer till infinity ;) ). >> Especially, to hinder these pages from getting touched at all. Pages >> that won't be put to the buddy will usually get PG_offline set (e.g., >> Hyper-V and XEN) - the only two users I am aware of. >> >> For Hyper-V (and also eventually virtio-mem), it is important to set >> PG_offline before marking the section to be online (SECTION_IS_ONLINE). >> Only this way, PG_offline is properly set on all pfn_to_online_page() >> pages, meaning "don't touch this page" - e.g., used to skip over such >> pages when suspending or by makedumpfile to skip over such offline pages >> when creating a memory dump. > > Thanks for the clarification. I have never really studied what those > callbacks are doing really. > >> So if we would e.g., try to piggy-back onto the memory_notify() >> infrastructure, we could >> 1. Online all pages to the buddy (dropping the callback) >> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg); >> -> in the notifier, pull pages from the buddy, mark sections online >> 3. Set all involved sections online (online_mem_sections()) > > This doesn't really sound any better. For one pages are immediately > usable when they hit the buddy allocator so this is racy and thus not > reliable. > >> However, I am not sure what actually happens after 1. - we are only >> holding the device hotplug lock and the memory hotplug lock, so the >> pages can just get allocated. Also, it sounds like more work and code >> for the same end result (okay, if the rework is really necessary, though). >> >> So yeah, while the current callback might not be optimal, I don't see an >> easy and clean way to rework this. With the change in this series we are >> at least able to simply defer doing what would have been done without >> the callback - not perfect but better. >> >> Do you have anything in mind that could work out and make this nicer? > > I am wondering why those pages get onlined when they are, in fact, > supposed to be offline. > It's the current way of emulating sub-memory-block hotplug on top of the memory bock device API we have. Hyper-V and XEN have been using that for a long time. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 11:34 ` David Hildenbrand @ 2019-09-23 11:43 ` David Hildenbrand 2019-09-23 12:07 ` Michal Hocko 1 sibling, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-23 11:43 UTC (permalink / raw) To: Michal Hocko Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 23.09.19 13:34, David Hildenbrand wrote: > On 23.09.19 13:15, Michal Hocko wrote: >> On Mon 23-09-19 11:31:30, David Hildenbrand wrote: >>> On 23.09.19 10:58, Michal Hocko wrote: >>>> On Fri 20-09-19 10:17:54, David Hildenbrand wrote: >>>>> On 09.09.19 13:48, David Hildenbrand wrote: >>>>>> Based on linux/next + "[PATCH 0/3] Remove __online_page_set_limits()" >>>>>> >>>>>> Let's replace the __online_page...() functions by generic_online_page(). >>>>>> Hyper-V only wants to delay the actual onlining of un-backed pages, so we >>>>>> can simpy re-use the generic function. >>>>>> >>>>>> Only compile-tested. >>>>>> >>>>>> Cc: Souptick Joarder <jrdr.linux@gmail.com> >>>>>> >>>>>> David Hildenbrand (3): >>>>>> mm/memory_hotplug: Export generic_online_page() >>>>>> hv_balloon: Use generic_online_page() >>>>>> mm/memory_hotplug: Remove __online_page_free() and >>>>>> __online_page_increment_counters() >>>>>> >>>>>> drivers/hv/hv_balloon.c | 3 +-- >>>>>> include/linux/memory_hotplug.h | 4 +--- >>>>>> mm/memory_hotplug.c | 17 ++--------------- >>>>>> 3 files changed, 4 insertions(+), 20 deletions(-) >>>>>> >>>>> >>>>> Ping, any comments on this one? >>>> >>>> Unification makes a lot of sense to me. You can add >>>> Acked-by: Michal Hocko <mhocko@suse.com> >>>> >>>> I will most likely won't surprise if I asked for more here though ;) >>> >>> I'm not surprised, but definitely not in a negative sense ;) I was >>> asking myself if we could somehow rework this, too. >>> >>>> I have to confess I really detest the whole concept of a hidden callback >>>> with a very weird API. Is this something we can do about? I do realize >>>> that adding a callback would require either cluttering the existing APIs >>>> but maybe we can come up with something more clever. Or maybe existing >>>> external users of online callback can do that as a separate step after >>>> the online is completed - or is this impossible due to locking >>>> guarantees? >>>> >>> >>> The use case of this (somewhat special) callback really is to avoid >>> selected (unbacked in the hypervisor) pages to get put to the buddy just >>> now, but instead to defer that (sometimes, defer till infinity ;) ). >>> Especially, to hinder these pages from getting touched at all. Pages >>> that won't be put to the buddy will usually get PG_offline set (e.g., >>> Hyper-V and XEN) - the only two users I am aware of. >>> >>> For Hyper-V (and also eventually virtio-mem), it is important to set >>> PG_offline before marking the section to be online (SECTION_IS_ONLINE). >>> Only this way, PG_offline is properly set on all pfn_to_online_page() >>> pages, meaning "don't touch this page" - e.g., used to skip over such >>> pages when suspending or by makedumpfile to skip over such offline pages >>> when creating a memory dump. >> >> Thanks for the clarification. I have never really studied what those >> callbacks are doing really. >> >>> So if we would e.g., try to piggy-back onto the memory_notify() >>> infrastructure, we could >>> 1. Online all pages to the buddy (dropping the callback) >>> 2. E.g., memory_notify(MEM_ONLINE_PAGES, &arg); >>> -> in the notifier, pull pages from the buddy, mark sections online >>> 3. Set all involved sections online (online_mem_sections()) >> >> This doesn't really sound any better. For one pages are immediately >> usable when they hit the buddy allocator so this is racy and thus not >> reliable. >> >>> However, I am not sure what actually happens after 1. - we are only >>> holding the device hotplug lock and the memory hotplug lock, so the >>> pages can just get allocated. Also, it sounds like more work and code >>> for the same end result (okay, if the rework is really necessary, though). >>> >>> So yeah, while the current callback might not be optimal, I don't see an >>> easy and clean way to rework this. With the change in this series we are >>> at least able to simply defer doing what would have been done without >>> the callback - not perfect but better. >>> >>> Do you have anything in mind that could work out and make this nicer? >> >> I am wondering why those pages get onlined when they are, in fact, >> supposed to be offline. >> > > It's the current way of emulating sub-memory-block hotplug on top of the > memory bock device API we have. Hyper-V and XEN have been using that for > a long time. > So one idea would be to let clients set pages to PG_offline during MEM_GOING_ONLINE. We could then skip any PG_offline pages when onlining pages, not onlining them to the buddy. But then, there still has to be a way to online pages when required - e.g., generic_online_page(). At least the callback could go. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 11:34 ` David Hildenbrand 2019-09-23 11:43 ` David Hildenbrand @ 2019-09-23 12:07 ` Michal Hocko 2019-09-23 12:20 ` David Hildenbrand 1 sibling, 1 reply; 14+ messages in thread From: Michal Hocko @ 2019-09-23 12:07 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On Mon 23-09-19 13:34:18, David Hildenbrand wrote: > On 23.09.19 13:15, Michal Hocko wrote: [...] > > I am wondering why those pages get onlined when they are, in fact, > > supposed to be offline. > > > > It's the current way of emulating sub-memory-block hotplug on top of the > memory bock device API we have. Hyper-V and XEN have been using that for > a long time. Do they really have to use the existing block interface when they in fact do not operate on the block granularity? Zone device memory already acts on sub section/block boundaries. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 12:07 ` Michal Hocko @ 2019-09-23 12:20 ` David Hildenbrand 2019-09-23 12:30 ` Michal Hocko 0 siblings, 1 reply; 14+ messages in thread From: David Hildenbrand @ 2019-09-23 12:20 UTC (permalink / raw) To: Michal Hocko Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 23.09.19 14:07, Michal Hocko wrote: > On Mon 23-09-19 13:34:18, David Hildenbrand wrote: >> On 23.09.19 13:15, Michal Hocko wrote: > [...] >>> I am wondering why those pages get onlined when they are, in fact, >>> supposed to be offline. >>> >> >> It's the current way of emulating sub-memory-block hotplug on top of the >> memory bock device API we have. Hyper-V and XEN have been using that for >> a long time. > > Do they really have to use the existing block interface when they in > fact do not operate on the block granularity? Zone device memory already > acts on sub section/block boundaries. > Yes, we need memory blocks, especially for user space to properly online them (as we discussed a while back, to decide on a zone) and for udev events, to e.g., properly reload kexec when memory blocks get added/removed/onlined/offlined. ZONE_DEVICE is special as it doesn't have to worry about onlining/offlining/dumping. Also, it doesn't care about SECTION_IS_ONLINE, but figuring out how to detect which sub sections/blocks have a valid memmap is still something to get fixed and is still getting discussed on MM. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 12:20 ` David Hildenbrand @ 2019-09-23 12:30 ` Michal Hocko 2019-09-23 12:34 ` David Hildenbrand 0 siblings, 1 reply; 14+ messages in thread From: Michal Hocko @ 2019-09-23 12:30 UTC (permalink / raw) To: David Hildenbrand Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On Mon 23-09-19 14:20:05, David Hildenbrand wrote: > On 23.09.19 14:07, Michal Hocko wrote: > > On Mon 23-09-19 13:34:18, David Hildenbrand wrote: > >> On 23.09.19 13:15, Michal Hocko wrote: > > [...] > >>> I am wondering why those pages get onlined when they are, in fact, > >>> supposed to be offline. > >>> > >> > >> It's the current way of emulating sub-memory-block hotplug on top of the > >> memory bock device API we have. Hyper-V and XEN have been using that for > >> a long time. > > > > Do they really have to use the existing block interface when they in > > fact do not operate on the block granularity? Zone device memory already > > acts on sub section/block boundaries. > > > > Yes, we need memory blocks, especially for user space to properly online > them (as we discussed a while back, to decide on a zone) and for udev > events, to e.g., properly reload kexec when memory blocks get > added/removed/onlined/offlined. Just to make sure I really follow. We need a user interface to control where the memory gets onlined but it is the driver which determines which part of the block really gets onlined, right? -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() 2019-09-23 12:30 ` Michal Hocko @ 2019-09-23 12:34 ` David Hildenbrand 0 siblings, 0 replies; 14+ messages in thread From: David Hildenbrand @ 2019-09-23 12:34 UTC (permalink / raw) To: Michal Hocko Cc: linux-kernel, linux-mm, Souptick Joarder, linux-hyperv, Andrew Morton, Dan Williams, Haiyang Zhang, K. Y. Srinivasan, Oscar Salvador, Pavel Tatashin, Qian Cai, Sasha Levin, Stephen Hemminger, Wei Yang On 23.09.19 14:30, Michal Hocko wrote: > On Mon 23-09-19 14:20:05, David Hildenbrand wrote: >> On 23.09.19 14:07, Michal Hocko wrote: >>> On Mon 23-09-19 13:34:18, David Hildenbrand wrote: >>>> On 23.09.19 13:15, Michal Hocko wrote: >>> [...] >>>>> I am wondering why those pages get onlined when they are, in fact, >>>>> supposed to be offline. >>>>> >>>> >>>> It's the current way of emulating sub-memory-block hotplug on top of the >>>> memory bock device API we have. Hyper-V and XEN have been using that for >>>> a long time. >>> >>> Do they really have to use the existing block interface when they in >>> fact do not operate on the block granularity? Zone device memory already >>> acts on sub section/block boundaries. >>> >> >> Yes, we need memory blocks, especially for user space to properly online >> them (as we discussed a while back, to decide on a zone) and for udev >> events, to e.g., properly reload kexec when memory blocks get >> added/removed/onlined/offlined. > > Just to make sure I really follow. We need a user interface to control > where the memory gets onlined but it is the driver which determines > which part of the block really gets onlined, right? > Yes, it's the drivers job to decide which part of the memory block can actually get used, and when. It's a little bit like the driver immediately allocates unbacked memory again, to free that memory to the buddy once requested. (similar to how ballooning works). -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2019-09-23 12:34 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-09 11:48 [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 1/3] " David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 2/3] hv_balloon: Use generic_online_page() David Hildenbrand 2019-09-09 11:48 ` [PATCH v1 3/3] mm/memory_hotplug: Remove __online_page_free() and __online_page_increment_counters() David Hildenbrand 2019-09-20 8:17 ` [PATCH v1 0/3] mm/memory_hotplug: Export generic_online_page() David Hildenbrand 2019-09-23 8:58 ` Michal Hocko 2019-09-23 9:31 ` David Hildenbrand 2019-09-23 11:15 ` Michal Hocko 2019-09-23 11:34 ` David Hildenbrand 2019-09-23 11:43 ` David Hildenbrand 2019-09-23 12:07 ` Michal Hocko 2019-09-23 12:20 ` David Hildenbrand 2019-09-23 12:30 ` Michal Hocko 2019-09-23 12:34 ` David Hildenbrand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).