* [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory @ 2019-04-09 10:01 David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand ` (3 more replies) 0 siblings, 4 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-09 10:01 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Andrew Banman, Andrew Morton, Andy Lutomirski, Arun KS, Benjamin Herrenschmidt, Borislav Petkov, Christophe Leroy, Dave Hansen, Fenghua Yu, Geert Uytterhoeven, Greg Kroah-Hartman, Heiko Carstens, H. Peter Anvin, Ingo Molnar, Ingo Molnar, Joonsoo Kim, Kirill A. Shutemov, Martin Schwidefsky, Masahiro Yamada, Mathieu Malaterre, Michael Ellerman, Michal Hocko, Mike Rapoport, mike.travis, Nicholas Piggin, Oscar Salvador, Oscar Salvador, Paul Mackerras, Pavel Tatashin, Peter Zijlstra, Qian Cai, Rafael J. Wysocki, Rich Felker, Rob Herring, Stefan Agner, Thomas Gleixner, Tony Luck, Vasily Gorbik, Wei Yang, Yoshinori Sato Error handling when removing memory is somewhat messed up right now. Some errors result in warnings, others are completely ignored. Memory unplug code can essentially not deal with errors properly as of now. remove_memory() will never fail. We have basically two choices: 1. Allow arch_remov_memory() and friends to fail, propagating errors via remove_memory(). Might be problematic (e.g. DIMMs consisting of multiple pieces added/removed separately). 2. Don't allow the functions to fail, handling errors in a nicer way. It seems like most errors that can theoretically happen are really corner cases and mostly theoretical (e.g. "section not valid"). However e.g. aborting removal of sections while all callers simply continue in case of errors is not nice. If we can gurantee that removal of memory always works (and WARN/skip in case of theoretical errors so we can figure out what is going on), we can go ahead and implement better error handling when adding memory. E.g. via add_memory(): arch_add_memory() ret = do_stuff() if (ret) { arch_remove_memory(); goto error; } Handling here that arch_remove_memory() might fail is basically impossible. So I suggest, let's avoid reporting errors while removing memory, warning on theoretical errors instead and continuing instead of aborting. Compile-tested on x86-64, powerpc, s390x. Tested on x86-64 with DIMMs. Based on git://git.cmpxchg.org/linux-mmots.git David Hildenbrand (4): mm/memory_hotplug: Release memory resource after arch_remove_memory() mm/memory_hotplug: Make unregister_memory_section() never fail mm/memory_hotplug: Make __remove_section() never fail mm/memory_hotplug: Make __remove_pages() and arch_remove_memory() never fail arch/ia64/mm/init.c | 11 ++---- arch/powerpc/mm/mem.c | 11 +++--- arch/s390/mm/init.c | 5 +-- arch/sh/mm/init.c | 11 ++---- arch/x86/mm/init_32.c | 5 +-- arch/x86/mm/init_64.c | 10 ++---- drivers/base/memory.c | 16 +++------ include/linux/memory.h | 2 +- include/linux/memory_hotplug.h | 8 ++--- mm/memory_hotplug.c | 63 +++++++++++++++++----------------- 10 files changed, 60 insertions(+), 82 deletions(-) -- 2.17.2 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-09 10:01 [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory David Hildenbrand @ 2019-04-09 10:01 ` David Hildenbrand 2019-04-09 22:41 ` Andrew Morton ` (3 more replies) 2019-04-09 10:01 ` [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail David Hildenbrand ` (2 subsequent siblings) 3 siblings, 4 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-09 10:01 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre __add_pages() doesn't add the memory resource, so __remove_pages() shouldn't remove it. Let's factor it out. Especially as it is a special case for memory used as system memory, added via add_memory() and friends. We now remove the resource after removing the sections instead of doing it the other way around. I don't think this change is problematic. add_memory() register memory resource arch_add_memory() remove_memory arch_remove_memory() release memory resource While at it, explain why we ignore errors and that it only happeny if we remove memory in a different granularity as we added it. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Qian Cai <cai@lca.pw> Cc: Arun KS <arunks@codeaurora.org> Cc: Mathieu Malaterre <malat@debian.org> Signed-off-by: David Hildenbrand <david@redhat.com> --- mm/memory_hotplug.c | 34 ++++++++++++++++++++-------------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 4970ff658055..696ed7ee5e28 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -562,20 +562,6 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, if (is_dev_zone(zone)) { if (altmap) map_offset = vmem_altmap_offset(altmap); - } else { - resource_size_t start, size; - - start = phys_start_pfn << PAGE_SHIFT; - size = nr_pages * PAGE_SIZE; - - ret = release_mem_region_adjustable(&iomem_resource, start, - size); - if (ret) { - resource_size_t endres = start + size - 1; - - pr_warn("Unable to release resource <%pa-%pa> (%d)\n", - &start, &endres, ret); - } } clear_zone_contiguous(zone); @@ -1820,6 +1806,25 @@ void try_offline_node(int nid) } EXPORT_SYMBOL(try_offline_node); +static void __release_memory_resource(u64 start, u64 size) +{ + int ret; + + /* + * When removing memory in the same granularity as it was added, + * this function never fails. It might only fail if resources + * have to be adjusted or split. We'll ignore the error, as + * removing of memory cannot fail. + */ + ret = release_mem_region_adjustable(&iomem_resource, start, size); + if (ret) { + resource_size_t endres = start + size - 1; + + pr_warn("Unable to release resource <%pa-%pa> (%d)\n", + &start, &endres, ret); + } +} + /** * remove_memory * @nid: the node ID @@ -1854,6 +1859,7 @@ void __ref __remove_memory(int nid, u64 start, u64 size) memblock_remove(start, size); arch_remove_memory(nid, start, size, NULL); + __release_memory_resource(start, size); try_offline_node(nid); -- 2.17.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand @ 2019-04-09 22:41 ` Andrew Morton 2019-04-10 8:07 ` David Hildenbrand 2019-04-17 11:52 ` Oscar Salvador ` (2 subsequent siblings) 3 siblings, 1 reply; 19+ messages in thread From: Andrew Morton @ 2019-04-09 22:41 UTC (permalink / raw) To: David Hildenbrand Cc: linux-mm, linux-kernel, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On Tue, 9 Apr 2019 12:01:45 +0200 David Hildenbrand <david@redhat.com> wrote: > __add_pages() doesn't add the memory resource, so __remove_pages() > shouldn't remove it. Let's factor it out. Especially as it is a special > case for memory used as system memory, added via add_memory() and > friends. > > We now remove the resource after removing the sections instead of doing > it the other way around. I don't think this change is problematic. > > add_memory() > register memory resource > arch_add_memory() > > remove_memory > arch_remove_memory() > release memory resource > > While at it, explain why we ignore errors and that it only happeny if > we remove memory in a different granularity as we added it. Seems sane. > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1820,6 +1806,25 @@ void try_offline_node(int nid) > } > EXPORT_SYMBOL(try_offline_node); > > +static void __release_memory_resource(u64 start, u64 size) > +{ > + int ret; > + > + /* > + * When removing memory in the same granularity as it was added, > + * this function never fails. It might only fail if resources > + * have to be adjusted or split. We'll ignore the error, as > + * removing of memory cannot fail. > + */ > + ret = release_mem_region_adjustable(&iomem_resource, start, size); > + if (ret) { > + resource_size_t endres = start + size - 1; > + > + pr_warn("Unable to release resource <%pa-%pa> (%d)\n", > + &start, &endres, ret); > + } > +} The types seem confused here. Should `start' and `size' be resource_size_t? Or maybe phys_addr_t. release_mem_region_adjustable() takes resource_size_t's. Is %pa the way to print a resource_size_t? I guess it happens to work because resource_size_t happens to map onto phys_addr_t, which isn't ideal. Wanna have a headscratch over that? > /** > * remove_memory > * @nid: the node ID > @@ -1854,6 +1859,7 @@ void __ref __remove_memory(int nid, u64 start, u64 size) > memblock_remove(start, size); > > arch_remove_memory(nid, start, size, NULL); > + __release_memory_resource(start, size); > > try_offline_node(nid); ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-09 22:41 ` Andrew Morton @ 2019-04-10 8:07 ` David Hildenbrand 2019-04-17 3:37 ` Andrew Morton 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand @ 2019-04-10 8:07 UTC (permalink / raw) To: Andrew Morton Cc: linux-mm, linux-kernel, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On 10.04.19 00:41, Andrew Morton wrote: > On Tue, 9 Apr 2019 12:01:45 +0200 David Hildenbrand <david@redhat.com> wrote: > >> __add_pages() doesn't add the memory resource, so __remove_pages() >> shouldn't remove it. Let's factor it out. Especially as it is a special >> case for memory used as system memory, added via add_memory() and >> friends. >> >> We now remove the resource after removing the sections instead of doing >> it the other way around. I don't think this change is problematic. >> >> add_memory() >> register memory resource >> arch_add_memory() >> >> remove_memory >> arch_remove_memory() >> release memory resource >> >> While at it, explain why we ignore errors and that it only happeny if >> we remove memory in a different granularity as we added it. > > Seems sane. > >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -1820,6 +1806,25 @@ void try_offline_node(int nid) >> } >> EXPORT_SYMBOL(try_offline_node); >> >> +static void __release_memory_resource(u64 start, u64 size) >> +{ >> + int ret; >> + >> + /* >> + * When removing memory in the same granularity as it was added, >> + * this function never fails. It might only fail if resources >> + * have to be adjusted or split. We'll ignore the error, as >> + * removing of memory cannot fail. >> + */ >> + ret = release_mem_region_adjustable(&iomem_resource, start, size); >> + if (ret) { >> + resource_size_t endres = start + size - 1; >> + >> + pr_warn("Unable to release resource <%pa-%pa> (%d)\n", >> + &start, &endres, ret); >> + } >> +} > > The types seem confused here. Should `start' and `size' be > resource_size_t? Or maybe phys_addr_t. Hmm, right now it has the same prototype as register_memory_resource. I guess using resource_size_t is the right thing to do. > > release_mem_region_adjustable() takes resource_size_t's. > > Is %pa the way to print a resource_size_t? I guess it happens to work > because resource_size_t happens to map onto phys_addr_t, which isn't > ideal. Documentation/core-api/printk-formats.rst " %pa[p] 0x01234567 or 0x0123456789abcdef For printing a phys_addr_t type (and its derivatives, such as resource_size_t) ... " Care to fixup both u64 to resource_size_t? Or should I send a patch? Whatever you prefer. Thanks! -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-10 8:07 ` David Hildenbrand @ 2019-04-17 3:37 ` Andrew Morton 2019-04-17 7:44 ` David Hildenbrand 0 siblings, 1 reply; 19+ messages in thread From: Andrew Morton @ 2019-04-17 3:37 UTC (permalink / raw) To: David Hildenbrand Cc: linux-mm, linux-kernel, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On Wed, 10 Apr 2019 10:07:24 +0200 David Hildenbrand <david@redhat.com> wrote: > Care to fixup both u64 to resource_size_t? Or should I send a patch? > Whatever you prefer. Please send along a fixup. This patch series has no evidence of having been reviewed :(. Can you suggest who could help us out here? ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-17 3:37 ` Andrew Morton @ 2019-04-17 7:44 ` David Hildenbrand 0 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-17 7:44 UTC (permalink / raw) To: Andrew Morton Cc: linux-mm, linux-kernel, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On 17.04.19 05:37, Andrew Morton wrote: > On Wed, 10 Apr 2019 10:07:24 +0200 David Hildenbrand <david@redhat.com> wrote: > >> Care to fixup both u64 to resource_size_t? Or should I send a patch? >> Whatever you prefer. > > Please send along a fixup. Will do! > > This patch series has no evidence of having been reviewed :(. Can you > suggest who could help us out here? Usually I would say Oscar and Michal, but they seem to be fairly busy. I guess only the first patch is a real change, the other three patches are mostly function prototype changes, handling errors in a nicer way. Thanks! -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand 2019-04-09 22:41 ` Andrew Morton @ 2019-04-17 11:52 ` Oscar Salvador 2019-04-17 12:02 ` [PATCH] mm/memory_hotplug: Fixup "Release memory resource after arch_remove_memory()" David Hildenbrand 2019-04-17 13:12 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() Michal Hocko 3 siblings, 0 replies; 19+ messages in thread From: Oscar Salvador @ 2019-04-17 11:52 UTC (permalink / raw) To: David Hildenbrand, linux-mm Cc: linux-kernel, Andrew Morton, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On Tue, 2019-04-09 at 12:01 +0200, David Hildenbrand wrote: > __add_pages() doesn't add the memory resource, so __remove_pages() > shouldn't remove it. Let's factor it out. Especially as it is a > special > case for memory used as system memory, added via add_memory() and > friends. I would call the special case the other way, aka: zone_device hooking into hotplug path. > > We now remove the resource after removing the sections instead of > doing > it the other way around. I don't think this change is problematic. > > add_memory() > register memory resource > arch_add_memory() > > remove_memory > arch_remove_memory() > release memory resource > > While at it, explain why we ignore errors and that it only happeny if > we remove memory in a different granularity as we added it. In the future we may want to allow drivers to hook directly into arch_add_memory()/arch_remove_memory(), and this will lead to different granularity in hot_add/hot_remove operations. At least that was one of the conclusions I drew from the last vmemmap- patchset. So, we will have to see how we can handle those kind of errors. > > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Hildenbrand <david@redhat.com> > Cc: Pavel Tatashin <pasha.tatashin@soleen.com> > Cc: Wei Yang <richard.weiyang@gmail.com> > Cc: Qian Cai <cai@lca.pw> > Cc: Arun KS <arunks@codeaurora.org> > Cc: Mathieu Malaterre <malat@debian.org> > Signed-off-by: David Hildenbrand <david@redhat.com> Besides what Andrew pointed out about the types of start,size, I do not see anything wrong: Reviewed-by: Oscar Salvador <osalvador@suse.de> > --- > mm/memory_hotplug.c | 34 ++++++++++++++++++++-------------- > 1 file changed, 20 insertions(+), 14 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 4970ff658055..696ed7ee5e28 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -562,20 +562,6 @@ int __remove_pages(struct zone *zone, unsigned > long phys_start_pfn, > if (is_dev_zone(zone)) { > if (altmap) > map_offset = vmem_altmap_offset(altmap); > - } else { > - resource_size_t start, size; > - > - start = phys_start_pfn << PAGE_SHIFT; > - size = nr_pages * PAGE_SIZE; > - > - ret = release_mem_region_adjustable(&iomem_resource, > start, > - size); > - if (ret) { > - resource_size_t endres = start + size - 1; > - > - pr_warn("Unable to release resource <%pa- > %pa> (%d)\n", > - &start, &endres, ret); > - } > } > > clear_zone_contiguous(zone); > @@ -1820,6 +1806,25 @@ void try_offline_node(int nid) > } > EXPORT_SYMBOL(try_offline_node); > > +static void __release_memory_resource(u64 start, u64 size) > +{ > + int ret; > + > + /* > + * When removing memory in the same granularity as it was > added, > + * this function never fails. It might only fail if > resources > + * have to be adjusted or split. We'll ignore the error, as > + * removing of memory cannot fail. > + */ > + ret = release_mem_region_adjustable(&iomem_resource, start, > size); > + if (ret) { > + resource_size_t endres = start + size - 1; > + > + pr_warn("Unable to release resource <%pa-%pa> > (%d)\n", > + &start, &endres, ret); > + } > +} > + > /** > * remove_memory > * @nid: the node ID > @@ -1854,6 +1859,7 @@ void __ref __remove_memory(int nid, u64 start, > u64 size) > memblock_remove(start, size); > > arch_remove_memory(nid, start, size, NULL); > + __release_memory_resource(start, size); > > try_offline_node(nid); > -- Oscar Salvador SUSE L3 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] mm/memory_hotplug: Fixup "Release memory resource after arch_remove_memory()" 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand 2019-04-09 22:41 ` Andrew Morton 2019-04-17 11:52 ` Oscar Salvador @ 2019-04-17 12:02 ` David Hildenbrand 2019-04-17 13:12 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() Michal Hocko 3 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-17 12:02 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre pr_warn with "%pa" expects a pointer to phys_addr_t or resource_size_t. So let's propery use resource_size_t for the start address and size, we are dealing with resources after all. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Qian Cai <cai@lca.pw> Cc: Arun KS <arunks@codeaurora.org> Cc: Mathieu Malaterre <malat@debian.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Signed-off-by: David Hildenbrand <david@redhat.com> --- mm/memory_hotplug.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 093f6dc66f46..328878b6799d 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1800,7 +1800,8 @@ void try_offline_node(int nid) } EXPORT_SYMBOL(try_offline_node); -static void __release_memory_resource(u64 start, u64 size) +static void __release_memory_resource(resource_size_t start, + resource_size_t size) { int ret; -- 2.20.1 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand ` (2 preceding siblings ...) 2019-04-17 12:02 ` [PATCH] mm/memory_hotplug: Fixup "Release memory resource after arch_remove_memory()" David Hildenbrand @ 2019-04-17 13:12 ` Michal Hocko 2019-04-17 13:24 ` David Hildenbrand 3 siblings, 1 reply; 19+ messages in thread From: Michal Hocko @ 2019-04-17 13:12 UTC (permalink / raw) To: David Hildenbrand Cc: linux-mm, linux-kernel, Andrew Morton, Oscar Salvador, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On Tue 09-04-19 12:01:45, David Hildenbrand wrote: > __add_pages() doesn't add the memory resource, so __remove_pages() > shouldn't remove it. Let's factor it out. Especially as it is a special > case for memory used as system memory, added via add_memory() and > friends. > > We now remove the resource after removing the sections instead of doing > it the other way around. I don't think this change is problematic. > > add_memory() > register memory resource > arch_add_memory() > > remove_memory > arch_remove_memory() > release memory resource > > While at it, explain why we ignore errors and that it only happeny if > we remove memory in a different granularity as we added it. OK, I agree that the symmetry is good in general and it certainly makes sense here as well. But does it make sense to pick up this particular part without larger considerations of add vs. remove apis? I have a strong feeling this wouldn't be the only thing to care about. In other words does this help future changes or it is more likely to cause more code conflicts with other features being developed right now? > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Hildenbrand <david@redhat.com> > Cc: Pavel Tatashin <pasha.tatashin@soleen.com> > Cc: Wei Yang <richard.weiyang@gmail.com> > Cc: Qian Cai <cai@lca.pw> > Cc: Arun KS <arunks@codeaurora.org> > Cc: Mathieu Malaterre <malat@debian.org> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > mm/memory_hotplug.c | 34 ++++++++++++++++++++-------------- > 1 file changed, 20 insertions(+), 14 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 4970ff658055..696ed7ee5e28 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -562,20 +562,6 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, > if (is_dev_zone(zone)) { > if (altmap) > map_offset = vmem_altmap_offset(altmap); > - } else { > - resource_size_t start, size; > - > - start = phys_start_pfn << PAGE_SHIFT; > - size = nr_pages * PAGE_SIZE; > - > - ret = release_mem_region_adjustable(&iomem_resource, start, > - size); > - if (ret) { > - resource_size_t endres = start + size - 1; > - > - pr_warn("Unable to release resource <%pa-%pa> (%d)\n", > - &start, &endres, ret); > - } > } > > clear_zone_contiguous(zone); > @@ -1820,6 +1806,25 @@ void try_offline_node(int nid) > } > EXPORT_SYMBOL(try_offline_node); > > +static void __release_memory_resource(u64 start, u64 size) > +{ > + int ret; > + > + /* > + * When removing memory in the same granularity as it was added, > + * this function never fails. It might only fail if resources > + * have to be adjusted or split. We'll ignore the error, as > + * removing of memory cannot fail. > + */ > + ret = release_mem_region_adjustable(&iomem_resource, start, size); > + if (ret) { > + resource_size_t endres = start + size - 1; > + > + pr_warn("Unable to release resource <%pa-%pa> (%d)\n", > + &start, &endres, ret); > + } > +} > + > /** > * remove_memory > * @nid: the node ID > @@ -1854,6 +1859,7 @@ void __ref __remove_memory(int nid, u64 start, u64 size) > memblock_remove(start, size); > > arch_remove_memory(nid, start, size, NULL); > + __release_memory_resource(start, size); > > try_offline_node(nid); > > -- > 2.17.2 > -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-17 13:12 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() Michal Hocko @ 2019-04-17 13:24 ` David Hildenbrand 2019-04-17 13:31 ` Michal Hocko 0 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand @ 2019-04-17 13:24 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, linux-kernel, Andrew Morton, Oscar Salvador, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On 17.04.19 15:12, Michal Hocko wrote: > On Tue 09-04-19 12:01:45, David Hildenbrand wrote: >> __add_pages() doesn't add the memory resource, so __remove_pages() >> shouldn't remove it. Let's factor it out. Especially as it is a special >> case for memory used as system memory, added via add_memory() and >> friends. >> >> We now remove the resource after removing the sections instead of doing >> it the other way around. I don't think this change is problematic. >> >> add_memory() >> register memory resource >> arch_add_memory() >> >> remove_memory >> arch_remove_memory() >> release memory resource >> >> While at it, explain why we ignore errors and that it only happeny if >> we remove memory in a different granularity as we added it. > > OK, I agree that the symmetry is good in general and it certainly makes > sense here as well. But does it make sense to pick up this particular > part without larger considerations of add vs. remove apis? I have a > strong feeling this wouldn't be the only thing to care about. In other > words does this help future changes or it is more likely to cause more > code conflicts with other features being developed right now? I am planning to 1. factor out memory block device handling, so features like sub-section add/remove are easier to add internally. Move it to the user that wants it. Clean up all the mess we have right now due to supporting memory block devices that span several sections. 2. Make sure that any arch_add_pages() and friends clean up properly if they fail instead of indicating failure but leaving some partially added memory lying around. 3. Clean up node handling regarding to memory hotplug/unplug. Especially don't allow to offline/remove memory spanning several nodes etc. IOW, in order to properly clean up memory block device handling and prepare for more changes people are interested in (e.g. sub-section add of device memory), this is the right thing to do. The other parts are bigger changes. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-17 13:24 ` David Hildenbrand @ 2019-04-17 13:31 ` Michal Hocko 2019-04-17 13:48 ` David Hildenbrand 0 siblings, 1 reply; 19+ messages in thread From: Michal Hocko @ 2019-04-17 13:31 UTC (permalink / raw) To: David Hildenbrand Cc: linux-mm, linux-kernel, Andrew Morton, Oscar Salvador, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On Wed 17-04-19 15:24:47, David Hildenbrand wrote: > On 17.04.19 15:12, Michal Hocko wrote: > > On Tue 09-04-19 12:01:45, David Hildenbrand wrote: > >> __add_pages() doesn't add the memory resource, so __remove_pages() > >> shouldn't remove it. Let's factor it out. Especially as it is a special > >> case for memory used as system memory, added via add_memory() and > >> friends. > >> > >> We now remove the resource after removing the sections instead of doing > >> it the other way around. I don't think this change is problematic. > >> > >> add_memory() > >> register memory resource > >> arch_add_memory() > >> > >> remove_memory > >> arch_remove_memory() > >> release memory resource > >> > >> While at it, explain why we ignore errors and that it only happeny if > >> we remove memory in a different granularity as we added it. > > > > OK, I agree that the symmetry is good in general and it certainly makes > > sense here as well. But does it make sense to pick up this particular > > part without larger considerations of add vs. remove apis? I have a > > strong feeling this wouldn't be the only thing to care about. In other > > words does this help future changes or it is more likely to cause more > > code conflicts with other features being developed right now? > > I am planning to > > 1. factor out memory block device handling, so features like sub-section > add/remove are easier to add internally. Move it to the user that wants > it. Clean up all the mess we have right now due to supporting memory > block devices that span several sections. > > 2. Make sure that any arch_add_pages() and friends clean up properly if > they fail instead of indicating failure but leaving some partially added > memory lying around. > > 3. Clean up node handling regarding to memory hotplug/unplug. Especially > don't allow to offline/remove memory spanning several nodes etc. Yes, this all sounds sane to me. > IOW, in order to properly clean up memory block device handling and > prepare for more changes people are interested in (e.g. sub-section add > of device memory), this is the right thing to do. The other parts are > bigger changes. This would be really valuable to have in the cover. Beause there is so much to clean up in this mess but making random small cleanups without a larger plan tends to step on others toes more than being useful. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() 2019-04-17 13:31 ` Michal Hocko @ 2019-04-17 13:48 ` David Hildenbrand 0 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-17 13:48 UTC (permalink / raw) To: Michal Hocko Cc: linux-mm, linux-kernel, Andrew Morton, Oscar Salvador, Pavel Tatashin, Wei Yang, Qian Cai, Arun KS, Mathieu Malaterre On 17.04.19 15:31, Michal Hocko wrote: > On Wed 17-04-19 15:24:47, David Hildenbrand wrote: >> On 17.04.19 15:12, Michal Hocko wrote: >>> On Tue 09-04-19 12:01:45, David Hildenbrand wrote: >>>> __add_pages() doesn't add the memory resource, so __remove_pages() >>>> shouldn't remove it. Let's factor it out. Especially as it is a special >>>> case for memory used as system memory, added via add_memory() and >>>> friends. >>>> >>>> We now remove the resource after removing the sections instead of doing >>>> it the other way around. I don't think this change is problematic. >>>> >>>> add_memory() >>>> register memory resource >>>> arch_add_memory() >>>> >>>> remove_memory >>>> arch_remove_memory() >>>> release memory resource >>>> >>>> While at it, explain why we ignore errors and that it only happeny if >>>> we remove memory in a different granularity as we added it. >>> >>> OK, I agree that the symmetry is good in general and it certainly makes >>> sense here as well. But does it make sense to pick up this particular >>> part without larger considerations of add vs. remove apis? I have a >>> strong feeling this wouldn't be the only thing to care about. In other >>> words does this help future changes or it is more likely to cause more >>> code conflicts with other features being developed right now? >> >> I am planning to >> >> 1. factor out memory block device handling, so features like sub-section >> add/remove are easier to add internally. Move it to the user that wants >> it. Clean up all the mess we have right now due to supporting memory >> block devices that span several sections. >> >> 2. Make sure that any arch_add_pages() and friends clean up properly if >> they fail instead of indicating failure but leaving some partially added >> memory lying around. >> >> 3. Clean up node handling regarding to memory hotplug/unplug. Especially >> don't allow to offline/remove memory spanning several nodes etc. > > Yes, this all sounds sane to me. > >> IOW, in order to properly clean up memory block device handling and >> prepare for more changes people are interested in (e.g. sub-section add >> of device memory), this is the right thing to do. The other parts are >> bigger changes. > > This would be really valuable to have in the cover. Beause there is so > much to clean up in this mess but making random small cleanups without a > larger plan tends to step on others toes more than being useful. I agree, let's discuss the bigger plan I have in mind 1. arch_add_memory()/arch_remove_memory() don't deal with memory block devices. add_memory()/remove_memory()/online_pages()/offline_pages() do. 2. add_memory()/remove_memory()/online_pages()/offline_pages() - Only work on memory block device alignment/granularity - Only work on single nodes. - Only work on single zones. 3. mem->nid correctly indicates if - Memory block devices belongs to single node / no node / multiple nodes - Fast and reliable way to detect remove_memory()/online_pages()/offline_pages() being called with multiple nodes. 4. arch_remove_memory() and friends never fail. Removing of memory always succeeds. This allows better error handling when adding of memory fails. We will move some parts from CONFIG_MEMORY_HOTREMOVE to CONFIG_MEMORY_HOTPLUG, so we can use them to clean up if adding of memory fails. 5. Remove all accesses to struct_page from memory removal path. Pages might never have been initialized, they should not be touched. All other features I see on the horizon (vmemmap on added memory, sub-section hot-add) would mainly be centered around arch_add_memory()/arch_remove_memory(), which would not have to deal with any special cases around memory block device memory. Do you agree, do you have any other points/things in mind you consider important? -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail 2019-04-09 10:01 [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand @ 2019-04-09 10:01 ` David Hildenbrand 2019-04-17 12:45 ` Oscar Salvador 2019-04-09 10:01 ` [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() " David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 4/4] mm/memory_hotplug: Make __remove_pages() and arch_remove_memory() " David Hildenbrand 3 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand @ 2019-04-09 10:01 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Greg Kroah-Hartman, Rafael J. Wysocki, Ingo Molnar, Andrew Banman, mike.travis, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre Failing while removing memory is mostly ignored and cannot really be handled. Let's treat errors in unregister_memory_section() in a nice way, warning, but continuing. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: "mike.travis@hpe.com" <mike.travis@hpe.com> Cc: David Hildenbrand <david@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Qian Cai <cai@lca.pw> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Arun KS <arunks@codeaurora.org> Cc: Mathieu Malaterre <malat@debian.org> Signed-off-by: David Hildenbrand <david@redhat.com> --- drivers/base/memory.c | 16 +++++----------- include/linux/memory.h | 2 +- mm/memory_hotplug.c | 4 +--- 3 files changed, 7 insertions(+), 15 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 0c9e22ffa47a..f180427e48f4 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -734,15 +734,18 @@ unregister_memory(struct memory_block *memory) { BUG_ON(memory->dev.bus != &memory_subsys); - /* drop the ref. we got in remove_memory_section() */ + /* drop the ref. we got via find_memory_block() */ put_device(&memory->dev); device_unregister(&memory->dev); } -static int remove_memory_section(struct mem_section *section) +void unregister_memory_section(struct mem_section *section) { struct memory_block *mem; + if (WARN_ON_ONCE(!present_section(section))) + return; + mutex_lock(&mem_sysfs_mutex); /* @@ -763,15 +766,6 @@ static int remove_memory_section(struct mem_section *section) out_unlock: mutex_unlock(&mem_sysfs_mutex); - return 0; -} - -int unregister_memory_section(struct mem_section *section) -{ - if (!present_section(section)) - return -EINVAL; - - return remove_memory_section(section); } #endif /* CONFIG_MEMORY_HOTREMOVE */ diff --git a/include/linux/memory.h b/include/linux/memory.h index a6ddefc60517..e1dc1bb2b787 100644 --- a/include/linux/memory.h +++ b/include/linux/memory.h @@ -113,7 +113,7 @@ extern int register_memory_isolate_notifier(struct notifier_block *nb); extern void unregister_memory_isolate_notifier(struct notifier_block *nb); int hotplug_memory_register(int nid, struct mem_section *section); #ifdef CONFIG_MEMORY_HOTREMOVE -extern int unregister_memory_section(struct mem_section *); +extern void unregister_memory_section(struct mem_section *); #endif extern int memory_dev_init(void); extern int memory_notify(unsigned long val, void *v); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 696ed7ee5e28..b0cb05748f99 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -527,9 +527,7 @@ static int __remove_section(struct zone *zone, struct mem_section *ms, if (!valid_section(ms)) return ret; - ret = unregister_memory_section(ms); - if (ret) - return ret; + unregister_memory_section(ms); scn_nr = __section_nr(ms); start_pfn = section_nr_to_pfn((unsigned long)scn_nr); -- 2.17.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail 2019-04-09 10:01 ` [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail David Hildenbrand @ 2019-04-17 12:45 ` Oscar Salvador 2019-04-17 13:10 ` David Hildenbrand 0 siblings, 1 reply; 19+ messages in thread From: Oscar Salvador @ 2019-04-17 12:45 UTC (permalink / raw) To: David Hildenbrand, linux-mm Cc: linux-kernel, Greg Kroah-Hartman, Rafael J. Wysocki, Ingo Molnar, Andrew Banman, mike.travis, Andrew Morton, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre On Tue, 2019-04-09 at 12:01 +0200, David Hildenbrand wrote: > Failing while removing memory is mostly ignored and cannot really be > handled. Let's treat errors in unregister_memory_section() in a nice > way, warning, but continuing. > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: "Rafael J. Wysocki" <rafael@kernel.org> > Cc: Ingo Molnar <mingo@kernel.org> > Cc: Andrew Banman <andrew.banman@hpe.com> > Cc: "mike.travis@hpe.com" <mike.travis@hpe.com> > Cc: David Hildenbrand <david@redhat.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Pavel Tatashin <pasha.tatashin@soleen.com> > Cc: Qian Cai <cai@lca.pw> > Cc: Wei Yang <richard.weiyang@gmail.com> > Cc: Arun KS <arunks@codeaurora.org> > Cc: Mathieu Malaterre <malat@debian.org> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > drivers/base/memory.c | 16 +++++----------- > include/linux/memory.h | 2 +- > mm/memory_hotplug.c | 4 +--- > 3 files changed, 7 insertions(+), 15 deletions(-) > > diff --git a/drivers/base/memory.c b/drivers/base/memory.c > index 0c9e22ffa47a..f180427e48f4 100644 > --- a/drivers/base/memory.c > +++ b/drivers/base/memory.c > @@ -734,15 +734,18 @@ unregister_memory(struct memory_block *memory) > { > BUG_ON(memory->dev.bus != &memory_subsys); > > - /* drop the ref. we got in remove_memory_section() */ > + /* drop the ref. we got via find_memory_block() */ > put_device(&memory->dev); > device_unregister(&memory->dev); > } > > -static int remove_memory_section(struct mem_section *section) > +void unregister_memory_section(struct mem_section *section) > { > struct memory_block *mem; > > + if (WARN_ON_ONCE(!present_section(section))) > + return; > + > mutex_lock(&mem_sysfs_mutex); > > /* > @@ -763,15 +766,6 @@ static int remove_memory_section(struct > mem_section *section) > > out_unlock: > mutex_unlock(&mem_sysfs_mutex); > - return 0; > -} > - > -int unregister_memory_section(struct mem_section *section) > -{ > - if (!present_section(section)) > - return -EINVAL; > - > - return remove_memory_section(section); > } > #endif /* CONFIG_MEMORY_HOTREMOVE */ > > diff --git a/include/linux/memory.h b/include/linux/memory.h > index a6ddefc60517..e1dc1bb2b787 100644 > --- a/include/linux/memory.h > +++ b/include/linux/memory.h > @@ -113,7 +113,7 @@ extern int > register_memory_isolate_notifier(struct notifier_block *nb); > extern void unregister_memory_isolate_notifier(struct notifier_block > *nb); > int hotplug_memory_register(int nid, struct mem_section *section); > #ifdef CONFIG_MEMORY_HOTREMOVE > -extern int unregister_memory_section(struct mem_section *); > +extern void unregister_memory_section(struct mem_section *); > #endif > extern int memory_dev_init(void); > extern int memory_notify(unsigned long val, void *v); > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index 696ed7ee5e28..b0cb05748f99 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -527,9 +527,7 @@ static int __remove_section(struct zone *zone, > struct mem_section *ms, > if (!valid_section(ms)) > return ret; > > - ret = unregister_memory_section(ms); > - if (ret) > - return ret; > + unregister_memory_section(ms); So, technically unregister_memory_section() can __only__ fail in case the section is not present, returning -EINVAL. Now, I was checking how the pair valid/present sections work. Unless I am mistaken, we only mark sections as memmap those sections that are present. This can come from two paths: - Early boot: * memblocks_present memory_present - mark sections as present sparse_init - iterates only over present sections sparse_init_nid sparse_init_one_section - mark section as valid - Hotplug path: * sparse_add_one_section section_mark_present - mark section as present sparse_init_one_section - mark section as valid During early boot, sparse_init iterates __only__ over present sections, so only those are marked valid as well, and during hotplug, the section is both marked present and valid. All in all, I think that we are safe if we remove the present_section check in your new unregister_memory_section(), as a valid_section cannot be non-present, and we do already catch those early in __remove_section(). Then, the only thing missing to be completely error-less in that codepath is to make unregister_mem_sect_under_nodes() void return-type. Not that it matters a lot as we are already ignoring its return code, but I find that quite disturbing and wrong. So, would you like to take this patch in your patchset in case you re- submit? From: Oscar Salvador <osalvador@suse.de> Date: Wed, 17 Apr 2019 14:41:32 +0200 Subject: [PATCH] mm,memory_hotplug: Refactor unregister_mem_sect_under_nodes Currently, the return code of unregister_mem_sect_under_nodes gets ignored. It can only fail in case we are not able to allocate a nodemask_t, but such allocation is too small, so it is not really clear we can actually fail. The maximum a nodemask_t can be is 128 bytes, so we can make the whole thing more simple if we simply allocate a nodemask_t within the stack. With this change, we can convert unregister_mem_sect_under_nodes to be void-return type. Signed-off-by: Oscar Salvador <osalvador@suse.de> --- drivers/base/node.c | 16 ++++------------ include/linux/node.h | 5 ++--- 2 files changed, 6 insertions(+), 15 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 8598fcbd2a17..fcd0f442e73d 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -802,19 +802,13 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg) } /* unregister memory section under all nodes that it spans */ -int unregister_mem_sect_under_nodes(struct memory_block *mem_blk, +void unregister_mem_sect_under_nodes(struct memory_block *mem_blk, unsigned long phys_index) { - NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL); + nodemask_t unlinked_nodes; unsigned long pfn, sect_start_pfn, sect_end_pfn; - if (!mem_blk) { - NODEMASK_FREE(unlinked_nodes); - return -EFAULT; - } - if (!unlinked_nodes) - return -ENOMEM; - nodes_clear(*unlinked_nodes); + nodes_clear(unlinked_nodes); sect_start_pfn = section_nr_to_pfn(phys_index); sect_end_pfn = sect_start_pfn + PAGES_PER_SECTION - 1; @@ -826,15 +820,13 @@ int unregister_mem_sect_under_nodes(struct memory_block *mem_blk, continue; if (!node_online(nid)) continue; - if (node_test_and_set(nid, *unlinked_nodes)) + if (node_test_and_set(nid, unlinked_nodes)) continue; sysfs_remove_link(&node_devices[nid]->dev.kobj, kobject_name(&mem_blk->dev.kobj)); sysfs_remove_link(&mem_blk->dev.kobj, kobject_name(&node_devices[nid]->dev.kobj)); } - NODEMASK_FREE(unlinked_nodes); - return 0; } int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn) diff --git a/include/linux/node.h b/include/linux/node.h index 1a557c589ecb..094ec9922bf5 100644 --- a/include/linux/node.h +++ b/include/linux/node.h @@ -139,7 +139,7 @@ extern int register_cpu_under_node(unsigned int cpu, unsigned int nid); extern int unregister_cpu_under_node(unsigned int cpu, unsigned int nid); extern int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg); -extern int unregister_mem_sect_under_nodes(struct memory_block *mem_blk, +extern void unregister_mem_sect_under_nodes(struct memory_block *mem_blk, unsigned long phys_index); extern int register_memory_node_under_compute_node(unsigned int mem_nid, @@ -176,10 +176,9 @@ static inline int register_mem_sect_under_node(struct memory_block *mem_blk, { return 0; } -static inline int unregister_mem_sect_under_nodes(struct memory_block *mem_blk, +static inline void unregister_mem_sect_under_nodes(struct memory_block *mem_blk, unsigned long phys_index) { - return 0; } static inline void register_hugetlbfs_with_node(node_registration_func_t reg, -- 2.12.3 -- Oscar Salvador SUSE L3 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail 2019-04-17 12:45 ` Oscar Salvador @ 2019-04-17 13:10 ` David Hildenbrand 0 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-17 13:10 UTC (permalink / raw) To: Oscar Salvador, linux-mm Cc: linux-kernel, Greg Kroah-Hartman, Rafael J. Wysocki, Ingo Molnar, Andrew Banman, mike.travis, Andrew Morton, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre On 17.04.19 14:45, Oscar Salvador wrote: > On Tue, 2019-04-09 at 12:01 +0200, David Hildenbrand wrote: >> Failing while removing memory is mostly ignored and cannot really be >> handled. Let's treat errors in unregister_memory_section() in a nice >> way, warning, but continuing. >> >> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> >> Cc: "Rafael J. Wysocki" <rafael@kernel.org> >> Cc: Ingo Molnar <mingo@kernel.org> >> Cc: Andrew Banman <andrew.banman@hpe.com> >> Cc: "mike.travis@hpe.com" <mike.travis@hpe.com> >> Cc: David Hildenbrand <david@redhat.com> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Oscar Salvador <osalvador@suse.de> >> Cc: Michal Hocko <mhocko@suse.com> >> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> >> Cc: Qian Cai <cai@lca.pw> >> Cc: Wei Yang <richard.weiyang@gmail.com> >> Cc: Arun KS <arunks@codeaurora.org> >> Cc: Mathieu Malaterre <malat@debian.org> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> drivers/base/memory.c | 16 +++++----------- >> include/linux/memory.h | 2 +- >> mm/memory_hotplug.c | 4 +--- >> 3 files changed, 7 insertions(+), 15 deletions(-) >> >> diff --git a/drivers/base/memory.c b/drivers/base/memory.c >> index 0c9e22ffa47a..f180427e48f4 100644 >> --- a/drivers/base/memory.c >> +++ b/drivers/base/memory.c >> @@ -734,15 +734,18 @@ unregister_memory(struct memory_block *memory) >> { >> BUG_ON(memory->dev.bus != &memory_subsys); >> >> - /* drop the ref. we got in remove_memory_section() */ >> + /* drop the ref. we got via find_memory_block() */ >> put_device(&memory->dev); >> device_unregister(&memory->dev); >> } >> >> -static int remove_memory_section(struct mem_section *section) >> +void unregister_memory_section(struct mem_section *section) >> { >> struct memory_block *mem; >> >> + if (WARN_ON_ONCE(!present_section(section))) >> + return; >> + >> mutex_lock(&mem_sysfs_mutex); >> >> /* >> @@ -763,15 +766,6 @@ static int remove_memory_section(struct >> mem_section *section) >> >> out_unlock: >> mutex_unlock(&mem_sysfs_mutex); >> - return 0; >> -} >> - >> -int unregister_memory_section(struct mem_section *section) >> -{ >> - if (!present_section(section)) >> - return -EINVAL; >> - >> - return remove_memory_section(section); >> } >> #endif /* CONFIG_MEMORY_HOTREMOVE */ >> >> diff --git a/include/linux/memory.h b/include/linux/memory.h >> index a6ddefc60517..e1dc1bb2b787 100644 >> --- a/include/linux/memory.h >> +++ b/include/linux/memory.h >> @@ -113,7 +113,7 @@ extern int >> register_memory_isolate_notifier(struct notifier_block *nb); >> extern void unregister_memory_isolate_notifier(struct notifier_block >> *nb); >> int hotplug_memory_register(int nid, struct mem_section *section); >> #ifdef CONFIG_MEMORY_HOTREMOVE >> -extern int unregister_memory_section(struct mem_section *); >> +extern void unregister_memory_section(struct mem_section *); >> #endif >> extern int memory_dev_init(void); >> extern int memory_notify(unsigned long val, void *v); >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index 696ed7ee5e28..b0cb05748f99 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -527,9 +527,7 @@ static int __remove_section(struct zone *zone, >> struct mem_section *ms, >> if (!valid_section(ms)) >> return ret; >> >> - ret = unregister_memory_section(ms); >> - if (ret) >> - return ret; >> + unregister_memory_section(ms); > > So, technically unregister_memory_section() can __only__ fail in case > the section is not present, returning -EINVAL. > > Now, I was checking how the pair valid/present sections work. > Unless I am mistaken, we only mark sections as memmap those sections > that are present. > > This can come from two paths: > > - Early boot: > * memblocks_present > memory_present - mark sections as present > sparse_init - iterates only over present sections > sparse_init_nid > sparse_init_one_section - mark section as valid > > - Hotplug path: > * sparse_add_one_section > section_mark_present - mark section as present > sparse_init_one_section - mark section as valid > > > During early boot, sparse_init iterates __only__ over present sections, > so only those are marked valid as well, and during hotplug, the section > is both marked present and valid. > > All in all, I think that we are safe if we remove the present_section > check in your new unregister_memory_section(), as a valid_section > cannot be non-present, and we do already catch those early in > __remove_section(). Yes, dropping the check completely would be the next step. > > Then, the only thing missing to be completely error-less in that > codepath is to make unregister_mem_sect_under_nodes() void return-type. > Not that it matters a lot as we are already ignoring its return code, > but I find that quite disturbing and wrong. > > So, would you like to take this patch in your patchset in case you re- > submit? I already had a look at that approach but would like to defer it, until I eventually factor out creation/deletion of memory devices (see my other RFC, still some work has to be done). Reasons: 1. We only support unplug of system ram memory belonging to exactly one node. No need to iterate over all nodes. mem->node should later tell us exactly what needs to be removed. No need to iterate at all. 2. unregister_mem_sect_under_nodes() should be replaced by "unregister_mem_block_under_nodes()". We only support unplug of whole memory blocks. Ripping out random sections from a memory block is not going to work and is not supported. IOW, this is some leftover when people turned memory block devices to span multiple sections. Once we clean that up, this problem goes away. -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() never fail 2019-04-09 10:01 [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail David Hildenbrand @ 2019-04-09 10:01 ` David Hildenbrand 2019-04-17 13:56 ` Oscar Salvador 2019-04-09 10:01 ` [PATCH v1 4/4] mm/memory_hotplug: Make __remove_pages() and arch_remove_memory() " David Hildenbrand 3 siblings, 1 reply; 19+ messages in thread From: David Hildenbrand @ 2019-04-09 10:01 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Andrew Morton, Oscar Salvador, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre Let's just warn in case a section is not valid instead of failing to remove somewhere in the middle of the process, returning an error that will be mostly ignored by callers. Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: David Hildenbrand <david@redhat.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Qian Cai <cai@lca.pw> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Arun KS <arunks@codeaurora.org> Cc: Mathieu Malaterre <malat@debian.org> Signed-off-by: David Hildenbrand <david@redhat.com> --- mm/memory_hotplug.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index b0cb05748f99..17a60281c36f 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -517,15 +517,15 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn) pgdat_resize_unlock(zone->zone_pgdat, &flags); } -static int __remove_section(struct zone *zone, struct mem_section *ms, - unsigned long map_offset, struct vmem_altmap *altmap) +static void __remove_section(struct zone *zone, struct mem_section *ms, + unsigned long map_offset, + struct vmem_altmap *altmap) { unsigned long start_pfn; int scn_nr; - int ret = -EINVAL; - if (!valid_section(ms)) - return ret; + if (WARN_ON_ONCE(!valid_section(ms))) + return; unregister_memory_section(ms); @@ -534,7 +534,6 @@ static int __remove_section(struct zone *zone, struct mem_section *ms, __remove_zone(zone, start_pfn); sparse_remove_one_section(zone, ms, map_offset, altmap); - return 0; } /** @@ -554,7 +553,7 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, { unsigned long i; unsigned long map_offset = 0; - int sections_to_remove, ret = 0; + int sections_to_remove; /* In the ZONE_DEVICE case device driver owns the memory region */ if (is_dev_zone(zone)) { @@ -575,16 +574,13 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION; cond_resched(); - ret = __remove_section(zone, __pfn_to_section(pfn), map_offset, - altmap); + __remove_section(zone, __pfn_to_section(pfn), map_offset, + altmap); map_offset = 0; - if (ret) - break; } set_zone_contiguous(zone); - - return ret; + return 0; } #endif /* CONFIG_MEMORY_HOTREMOVE */ -- 2.17.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() never fail 2019-04-09 10:01 ` [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() " David Hildenbrand @ 2019-04-17 13:56 ` Oscar Salvador 2019-04-24 6:54 ` David Hildenbrand 0 siblings, 1 reply; 19+ messages in thread From: Oscar Salvador @ 2019-04-17 13:56 UTC (permalink / raw) To: David Hildenbrand, linux-mm Cc: linux-kernel, Andrew Morton, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre On Tue, 2019-04-09 at 12:01 +0200, David Hildenbrand wrote: > Let's just warn in case a section is not valid instead of failing to > remove somewhere in the middle of the process, returning an error > that > will be mostly ignored by callers. > > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Oscar Salvador <osalvador@suse.de> > Cc: Michal Hocko <mhocko@suse.com> > Cc: David Hildenbrand <david@redhat.com> > Cc: Pavel Tatashin <pasha.tatashin@soleen.com> > Cc: Qian Cai <cai@lca.pw> > Cc: Wei Yang <richard.weiyang@gmail.com> > Cc: Arun KS <arunks@codeaurora.org> > Cc: Mathieu Malaterre <malat@debian.org> > Signed-off-by: David Hildenbrand <david@redhat.com> Just a nit: I think this could be combined with patch#2. The only reason to fail in here is 1) !valid_section 2) !present_section. As I stated in patch#2, one cannot be without the other, so makes sense to rip present_section check from unregister_mem_section() as well. Then, you could combine both changelogs explaining the whole thing, and why we do not need the present_section check either. But the change looks good to me: Reviewed-by: Oscar Salvador <osalvador@suse.de> > --- > mm/memory_hotplug.c | 22 +++++++++------------- > 1 file changed, 9 insertions(+), 13 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index b0cb05748f99..17a60281c36f 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -517,15 +517,15 @@ static void __remove_zone(struct zone *zone, > unsigned long start_pfn) > pgdat_resize_unlock(zone->zone_pgdat, &flags); > } > > -static int __remove_section(struct zone *zone, struct mem_section > *ms, > - unsigned long map_offset, struct vmem_altmap > *altmap) > +static void __remove_section(struct zone *zone, struct mem_section > *ms, > + unsigned long map_offset, > + struct vmem_altmap *altmap) > { > unsigned long start_pfn; > int scn_nr; > - int ret = -EINVAL; > > - if (!valid_section(ms)) > - return ret; > + if (WARN_ON_ONCE(!valid_section(ms))) > + return; > > unregister_memory_section(ms); > > @@ -534,7 +534,6 @@ static int __remove_section(struct zone *zone, > struct mem_section *ms, > __remove_zone(zone, start_pfn); > > sparse_remove_one_section(zone, ms, map_offset, altmap); > - return 0; > } > > /** > @@ -554,7 +553,7 @@ int __remove_pages(struct zone *zone, unsigned > long phys_start_pfn, > { > unsigned long i; > unsigned long map_offset = 0; > - int sections_to_remove, ret = 0; > + int sections_to_remove; > > /* In the ZONE_DEVICE case device driver owns the memory > region */ > if (is_dev_zone(zone)) { > @@ -575,16 +574,13 @@ int __remove_pages(struct zone *zone, unsigned > long phys_start_pfn, > unsigned long pfn = phys_start_pfn + > i*PAGES_PER_SECTION; > > cond_resched(); > - ret = __remove_section(zone, __pfn_to_section(pfn), > map_offset, > - altmap); > + __remove_section(zone, __pfn_to_section(pfn), > map_offset, > + altmap); > map_offset = 0; > - if (ret) > - break; > } > > set_zone_contiguous(zone); > - > - return ret; > + return 0; > } > #endif /* CONFIG_MEMORY_HOTREMOVE */ > -- Oscar Salvador SUSE L3 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() never fail 2019-04-17 13:56 ` Oscar Salvador @ 2019-04-24 6:54 ` David Hildenbrand 0 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-24 6:54 UTC (permalink / raw) To: Oscar Salvador, linux-mm Cc: linux-kernel, Andrew Morton, Michal Hocko, Pavel Tatashin, Qian Cai, Wei Yang, Arun KS, Mathieu Malaterre On 17.04.19 15:56, Oscar Salvador wrote: > On Tue, 2019-04-09 at 12:01 +0200, David Hildenbrand wrote: >> Let's just warn in case a section is not valid instead of failing to >> remove somewhere in the middle of the process, returning an error >> that >> will be mostly ignored by callers. >> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Oscar Salvador <osalvador@suse.de> >> Cc: Michal Hocko <mhocko@suse.com> >> Cc: David Hildenbrand <david@redhat.com> >> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> >> Cc: Qian Cai <cai@lca.pw> >> Cc: Wei Yang <richard.weiyang@gmail.com> >> Cc: Arun KS <arunks@codeaurora.org> >> Cc: Mathieu Malaterre <malat@debian.org> >> Signed-off-by: David Hildenbrand <david@redhat.com> > > Just a nit: > > I think this could be combined with patch#2. > The only reason to fail in here is 1) !valid_section 2) > !present_section. > As I stated in patch#2, one cannot be without the other, so makes sense > to rip present_section check from unregister_mem_section() as well. > Then, you could combine both changelogs explaining the whole thing, and > why we do not need the present_section check either. > If I have to resend the whole thing, I might do that. Otherwise we can drop the present_section() based on your explanation later. Thanks! > But the change looks good to me: > > Reviewed-by: Oscar Salvador <osalvador@suse.de> -- Thanks, David / dhildenb ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v1 4/4] mm/memory_hotplug: Make __remove_pages() and arch_remove_memory() never fail 2019-04-09 10:01 [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory David Hildenbrand ` (2 preceding siblings ...) 2019-04-09 10:01 ` [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() " David Hildenbrand @ 2019-04-09 10:01 ` David Hildenbrand 3 siblings, 0 replies; 19+ messages in thread From: David Hildenbrand @ 2019-04-09 10:01 UTC (permalink / raw) To: linux-mm Cc: linux-kernel, David Hildenbrand, Tony Luck, Fenghua Yu, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, Martin Schwidefsky, Heiko Carstens, Yoshinori Sato, Rich Felker, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, Andrew Morton, Michal Hocko, Mike Rapoport, Oscar Salvador, Kirill A. Shutemov, Christophe Leroy, Stefan Agner, Nicholas Piggin, Pavel Tatashin, Vasily Gorbik, Arun KS, Geert Uytterhoeven, Masahiro Yamada, Rob Herring, Joonsoo Kim, Wei Yang, Qian Cai, Mathieu Malaterre, linux-ia64, linuxppc-dev, linux-sh, linux-s390 All callers of arch_remove_memory() ignore errors. And we should really try to remove any errors from the memory removal path. No more errors are reported from __remove_pages(). BUG() in s390x code in case arch_remove_memory() is triggered. We may implement that properly later. WARN in case powerpc code failed to remove the section mapping, which is better than ignoring the error completely right now. Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Oscar Salvador <osalvador@suse.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Stefan Agner <stefan@agner.ch> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Arun KS <arunks@codeaurora.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Rob Herring <robh@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Qian Cai <cai@lca.pw> Cc: Mathieu Malaterre <malat@debian.org> Cc: linux-ia64@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-sh@vger.kernel.org Cc: linux-s390@vger.kernel.org Signed-off-by: David Hildenbrand <david@redhat.com> --- arch/ia64/mm/init.c | 11 +++-------- arch/powerpc/mm/mem.c | 11 ++++------- arch/s390/mm/init.c | 5 +++-- arch/sh/mm/init.c | 11 +++-------- arch/x86/mm/init_32.c | 5 +++-- arch/x86/mm/init_64.c | 10 +++------- include/linux/memory_hotplug.h | 8 ++++---- mm/memory_hotplug.c | 5 ++--- 8 files changed, 25 insertions(+), 41 deletions(-) diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 379eb1f9adc9..d28e29103bdb 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -682,20 +682,15 @@ int arch_add_memory(int nid, u64 start, u64 size, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; - int ret; zone = page_zone(pfn_to_page(start_pfn)); - ret = __remove_pages(zone, start_pfn, nr_pages, altmap); - if (ret) - pr_warn("%s: Problem encountered in __remove_pages() as" - " ret=%d\n", __func__, ret); - - return ret; + __remove_pages(zone, start_pfn, nr_pages, altmap); } #endif #endif diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 76deaa8525db..cc9425fb9056 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -131,8 +131,8 @@ int __meminit arch_add_memory(int nid, u64 start, u64 size, } #ifdef CONFIG_MEMORY_HOTREMOVE -int __meminit arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) +void __meminit arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; @@ -147,14 +147,13 @@ int __meminit arch_remove_memory(int nid, u64 start, u64 size, if (altmap) page += vmem_altmap_offset(altmap); - ret = __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); - if (ret) - return ret; + __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); /* Remove htab bolted mappings for this section of memory */ start = (unsigned long)__va(start); flush_inval_dcache_range(start, start + size); ret = remove_section_mapping(start, start + size); + WARN_ON_ONCE(ret); /* Ensure all vmalloc mappings are flushed in case they also * hit that section of memory @@ -162,8 +161,6 @@ int __meminit arch_remove_memory(int nid, u64 start, u64 size, vm_unmap_aliases(); resize_hpt_for_hotplug(memblock_phys_mem_size()); - - return ret; } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index f5db961ad792..31b1071315d7 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -234,14 +234,15 @@ int arch_add_memory(int nid, u64 start, u64 size, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { /* * There is no hardware or firmware interface which could trigger a * hot memory remove on s390. So there is nothing that needs to be * implemented. */ - return -EBUSY; + BUG(); } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c index 168d3a6b9358..5aeb4d7099a1 100644 --- a/arch/sh/mm/init.c +++ b/arch/sh/mm/init.c @@ -429,20 +429,15 @@ EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid); #endif #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = PFN_DOWN(start); unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; - int ret; zone = page_zone(pfn_to_page(start_pfn)); - ret = __remove_pages(zone, start_pfn, nr_pages, altmap); - if (unlikely(ret)) - pr_warn("%s: Failed, __remove_pages() == %d\n", __func__, - ret); - - return ret; + __remove_pages(zone, start_pfn, nr_pages, altmap); } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 755dbed85531..075e568098f2 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -860,14 +860,15 @@ int arch_add_memory(int nid, u64 start, u64 size, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) +void arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct zone *zone; zone = page_zone(pfn_to_page(start_pfn)); - return __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(zone, start_pfn, nr_pages, altmap); } #endif #endif diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index db42c11b48fb..20d14254b686 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1141,24 +1141,20 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end) remove_pagetable(start, end, true, NULL); } -int __ref arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) +void __ref arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; struct page *page = pfn_to_page(start_pfn); struct zone *zone; - int ret; /* With altmap the first mapped page is offset from @start */ if (altmap) page += vmem_altmap_offset(altmap); zone = page_zone(page); - ret = __remove_pages(zone, start_pfn, nr_pages, altmap); - WARN_ON_ONCE(ret); + __remove_pages(zone, start_pfn, nr_pages, altmap); kernel_physical_mapping_remove(start, start + size); - - return ret; } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index d0a145ffa4fe..9d0efac902ec 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -112,10 +112,10 @@ static inline bool movable_node_is_enabled(void) } #ifdef CONFIG_MEMORY_HOTREMOVE -extern int arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap); -extern int __remove_pages(struct zone *zone, unsigned long start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap); +extern void arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap); +extern void __remove_pages(struct zone *zone, unsigned long start_pfn, + unsigned long nr_pages, struct vmem_altmap *altmap); #endif /* CONFIG_MEMORY_HOTREMOVE */ /* diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 17a60281c36f..52fef4a81e4c 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -548,8 +548,8 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * sure that pages are marked reserved and zones are adjust properly by * calling offline_pages(). */ -int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) +void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, + unsigned long nr_pages, struct vmem_altmap *altmap) { unsigned long i; unsigned long map_offset = 0; @@ -580,7 +580,6 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, } set_zone_contiguous(zone); - return 0; } #endif /* CONFIG_MEMORY_HOTREMOVE */ -- 2.17.2 ^ permalink raw reply related [flat|nested] 19+ messages in thread
end of thread, other threads:[~2019-04-24 6:54 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-04-09 10:01 [PATCH v1 0/4] mm/memory_hotplug: Better error handling when removing memory David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() David Hildenbrand 2019-04-09 22:41 ` Andrew Morton 2019-04-10 8:07 ` David Hildenbrand 2019-04-17 3:37 ` Andrew Morton 2019-04-17 7:44 ` David Hildenbrand 2019-04-17 11:52 ` Oscar Salvador 2019-04-17 12:02 ` [PATCH] mm/memory_hotplug: Fixup "Release memory resource after arch_remove_memory()" David Hildenbrand 2019-04-17 13:12 ` [PATCH v1 1/4] mm/memory_hotplug: Release memory resource after arch_remove_memory() Michal Hocko 2019-04-17 13:24 ` David Hildenbrand 2019-04-17 13:31 ` Michal Hocko 2019-04-17 13:48 ` David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 2/4] mm/memory_hotplug: Make unregister_memory_section() never fail David Hildenbrand 2019-04-17 12:45 ` Oscar Salvador 2019-04-17 13:10 ` David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 3/4] mm/memory_hotplug: Make __remove_section() " David Hildenbrand 2019-04-17 13:56 ` Oscar Salvador 2019-04-24 6:54 ` David Hildenbrand 2019-04-09 10:01 ` [PATCH v1 4/4] mm/memory_hotplug: Make __remove_pages() and arch_remove_memory() " David Hildenbrand
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).