Some cleanups (+ one fix for a special case) in the context of
online_pages(). Hope I am not missing something obvious. Did a sanity
test with DIMMs only.

v1 -> v2:
- "mm/memory_hotplug: Handle unaligned start and nr_pages in
   online_pages_blocks()"
-- Turned into "mm/memory_hotplug: make sure the pfn is aligned to the
   order when onlining"
-- Dropped the "nr_pages not an order of two" condition for now as
   requested by Michal, but kept a simplified alignment check
- "mm/memory_hotplug: Drop PageReserved() check in online_pages_range()"
-- Split out from "mm/memory_hotplug: Simplify online_pages_range()"
- "mm/memory_hotplug: Simplify online_pages_range()"
-- Modified due to the other changes

David Hildenbrand (5):
  resource: Use PFN_UP / PFN_DOWN in walk_system_ram_range()
  mm/memory_hotplug: Drop PageReserved() check in online_pages_range()
  mm/memory_hotplug: Simplify online_pages_range()
  mm/memory_hotplug: Make sure the pfn is aligned to the order when
    onlining
  mm/memory_hotplug: online_pages cannot be 0 in online_pages()

 kernel/resource.c   |  4 +--
 mm/memory_hotplug.c | 61 ++++++++++++++++++++-------------------
 2 files changed, 29 insertions(+), 36 deletions(-)

-- 
2.21.0
This makes it clearer that we will never call func() with duplicate PFNs
in case we have multiple sub-page memory resources. All unaligned parts
of PFNs are completely discarded.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/resource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 7ea4306503c5..88ee39fa9103 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -487,8 +487,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 	while (start < end &&
 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
 				    false, &res)) {
-		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
-		end_pfn = (res.end + 1) >> PAGE_SHIFT;
+		pfn = PFN_UP(res.start);
+		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
			ret = (*func)(pfn, end_pfn - pfn, arg);
 		if (ret)
-- 
2.21.0
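For reference, the rounding semantics of the two helpers: PFN_UP() rounds a
physical address up to the next full page frame, PFN_DOWN() rounds down, so
a resource smaller than a page collapses to an empty range and func() is
never called for it. A minimal userspace sketch of that behavior (assuming
a 4 KiB page size; the real macros live in include/linux/pfn.h):

#include <stdio.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

int main(void)
{
	/* hypothetical sub-page resource: [0x1100, 0x1300) */
	unsigned long res_start = 0x1100, res_end = 0x12ff;
	unsigned long pfn = PFN_UP(res_start);		/* 2 */
	unsigned long end_pfn = PFN_DOWN(res_end + 1);	/* 1 */

	if (end_pfn > pfn)
		printf("func(%lu, %lu)\n", pfn, end_pfn - pfn);
	else
		printf("sub-page resource discarded\n");	/* taken here */
	return 0;
}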
move_pfn_range_to_zone() will set all pages to PG_reserved via
memmap_init_zone(). The only way a page could no longer be reserved
would be if a MEM_GOING_ONLINE notifier would clear PG_reserved - which
is not done (the online_page callback is used for that purpose by
e.g., Hyper-V instead). walk_system_ram_range() will never call
online_pages_range() with duplicate PFNs, so drop the PageReserved()
check.

This seems to be a leftover from ancient times where the memmap was
initialized when adding memory and we wanted to check for already
onlined memory.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 3706a137d880..10ad970f3f14 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -653,9 +653,7 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 {
 	unsigned long onlined_pages = *(unsigned long *)arg;
 
-	if (PageReserved(pfn_to_page(start_pfn)))
-		onlined_pages += online_pages_blocks(start_pfn, nr_pages);
-
+	onlined_pages += online_pages_blocks(start_pfn, nr_pages);
 	online_mem_sections(start_pfn, start_pfn + nr_pages);
 
 	*(unsigned long *)arg = onlined_pages;
-- 
2.21.0
online_pages always corresponds to nr_pages. Simplify the code, getting
rid of online_pages_blocks(). Add some comments.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 36 ++++++++++++++++--------------------
 1 file changed, 16 insertions(+), 20 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 10ad970f3f14..63b1775f7cf8 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -632,31 +632,27 @@ static void generic_online_page(struct page *page, unsigned int order)
 #endif
 }
 
-static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
-{
-	unsigned long end = start + nr_pages;
-	int order, onlined_pages = 0;
-
-	while (start < end) {
-		order = min(MAX_ORDER - 1,
-			    get_order(PFN_PHYS(end) - PFN_PHYS(start)));
-		(*online_page_callback)(pfn_to_page(start), order);
-
-		onlined_pages += (1UL << order);
-		start += (1UL << order);
-	}
-	return onlined_pages;
-}
-
 static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 			      void *arg)
 {
-	unsigned long onlined_pages = *(unsigned long *)arg;
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn;
+	int order;
+
+	/*
+	 * Online the pages. The callback might decide to keep some pages
+	 * PG_reserved (to add them to the buddy later), but we still account
+	 * them as being online/belonging to this zone ("present").
+	 */
+	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
+		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
+		(*online_page_callback)(pfn_to_page(pfn), order);
+	}
 
-	onlined_pages += online_pages_blocks(start_pfn, nr_pages);
-	online_mem_sections(start_pfn, start_pfn + nr_pages);
+	/* mark all involved sections as online */
+	online_mem_sections(start_pfn, end_pfn);
 
-	*(unsigned long *)arg = onlined_pages;
+	*(unsigned long *)arg += nr_pages;
 	return 0;
 }
 
-- 
2.21.0
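To see how the open-coded loop carves a range into the largest buddy-sized
chunks, here is a small userspace sketch of the same logic (assuming
MAX_ORDER of 11 and 4 KiB pages; the pfn values are made up and get_order()
is reimplemented with the kernel's round-up semantics):

#include <stdio.h>

#define MAX_ORDER	11
#define PAGE_SHIFT	12
#define PFN_PHYS(x)	((unsigned long)(x) << PAGE_SHIFT)

/* smallest order whose block covers @size, like the kernel's get_order() */
static int get_order(unsigned long size)
{
	int order = 0;

	while ((1UL << (order + PAGE_SHIFT)) < size)
		order++;
	return order;
}

int main(void)
{
	unsigned long pfn = 0x800, end_pfn = 0x800 + 1024 + 128;
	int order;

	for (; pfn < end_pfn; pfn += 1UL << order) {
		order = get_order(PFN_PHYS(end_pfn - pfn));
		if (order > MAX_ORDER - 1)	/* open-coded min() */
			order = MAX_ORDER - 1;
		/* stands in for (*online_page_callback)(page, order) */
		printf("online pfn %#lx, order %d (%lu pages)\n",
		       pfn, order, 1UL << order);
	}
	/* prints an order-10 chunk (1024 pages), then an order-7 chunk */
	return 0;
}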
Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
order") assumed that any PFN we get via memory resources is aligned to
MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
check the alignment and fall back to single pages.

Cc: Arun KS <arunks@codeaurora.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 63b1775f7cf8..f245fb50ba7f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	 */
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
 		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
+		/* __free_pages_core() wants pfns to be aligned to the order */
+		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
+			order = 0;
 		(*online_page_callback)(pfn_to_page(pfn), order);
 	}
 
-- 
2.21.0
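To illustrate what the check buys us: with a start pfn that is not aligned
to the picked order, the loop now onlines single pages until it hits an
aligned boundary instead of handing __free_pages_core() a misaligned block.
A userspace sketch with made-up pfns (same assumptions as the sketch above):

#include <stdio.h>

#define MAX_ORDER	11
#define PAGE_SHIFT	12
#define PFN_PHYS(x)	((unsigned long)(x) << PAGE_SHIFT)
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

static int get_order(unsigned long size)
{
	int order = 0;

	while ((1UL << (order + PAGE_SHIFT)) < size)
		order++;
	return order;
}

int main(void)
{
	/* made-up range: starts one page before an order-2 boundary */
	unsigned long pfn = 0x7ff, end_pfn = 0x7ff + 5;
	int order;

	for (; pfn < end_pfn; pfn += 1UL << order) {
		order = get_order(PFN_PHYS(end_pfn - pfn));
		if (order > MAX_ORDER - 1)
			order = MAX_ORDER - 1;
		/* the fix: fall back to single pages on misalignment */
		if (!IS_ALIGNED(pfn, 1UL << order))
			order = 0;
		printf("online pfn %#lx, order %d\n", pfn, order);
	}
	/* prints: pfn 0x7ff order 0, then pfn 0x800 order 2 */
	return 0;
}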
walk_system_ram_range() will fail with -EINVAL in case
online_pages_range() was never called (== no resource applicable in the
range). Otherwise, we will always call online_pages_range() with
nr_pages > 0 and, therefore, have online_pages > 0.

Remove that special handling.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memory_hotplug.c | 22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index f245fb50ba7f..01456fc66564 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -853,6 +853,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	ret = walk_system_ram_range(pfn, nr_pages, &onlined_pages,
 		online_pages_range);
 	if (ret) {
+		/* not a single memory resource was applicable */
 		if (need_zonelists_rebuild)
 			zone_pcp_reset(zone);
 		goto failed_addition;
@@ -866,27 +867,22 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 
 	shuffle_zone(zone);
 
-	if (onlined_pages) {
-		node_states_set_node(nid, &arg);
-		if (need_zonelists_rebuild)
-			build_all_zonelists(NULL);
-		else
-			zone_pcp_update(zone);
-	}
+	node_states_set_node(nid, &arg);
+	if (need_zonelists_rebuild)
+		build_all_zonelists(NULL);
+	else
+		zone_pcp_update(zone);
 
 	init_per_zone_wmark_min();
 
-	if (onlined_pages) {
-		kswapd_run(nid);
-		kcompactd_run(nid);
-	}
+	kswapd_run(nid);
+	kcompactd_run(nid);
 
 	vm_total_pages = nr_free_pagecache_pages();
 
 	writeback_set_ratelimit();
 
-	if (onlined_pages)
-		memory_notify(MEM_ONLINE, &arg);
+	memory_notify(MEM_ONLINE, &arg);
 
 	mem_hotplug_done();
 	return 0;
-- 
2.21.0
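The argument hinges on walk_system_ram_range()'s return convention: ret
starts out as -EINVAL and only becomes 0 once func() ran on a non-empty
range. A condensed userspace model of that contract (toy resource table
with made-up values, not the real iomem walker):

#include <stdio.h>
#include <errno.h>

struct res { unsigned long start_pfn, nr_pages; };

/* mirrors the contract: ret stays -EINVAL until func() ran successfully */
static int walk_toy(const struct res *r, int nr,
		    int (*func)(unsigned long, unsigned long, void *),
		    void *arg)
{
	int i, ret = -EINVAL;

	for (i = 0; i < nr; i++) {
		if (r[i].nr_pages)	/* only non-empty ranges */
			ret = func(r[i].start_pfn, r[i].nr_pages, arg);
		if (ret)		/* includes the initial -EINVAL */
			break;
	}
	return ret;
}

static int count(unsigned long start_pfn, unsigned long nr_pages, void *arg)
{
	*(unsigned long *)arg += nr_pages;	/* like online_pages_range() */
	return 0;
}

int main(void)
{
	struct res ram[] = { { 0x800, 128 } };	/* made-up resource */
	unsigned long onlined = 0;
	int ret = walk_toy(ram, 1, count, &onlined);

	/* ret == 0 implies onlined > 0; with no resources, ret == -EINVAL */
	printf("ret=%d onlined=%lu\n", ret, onlined);
	return 0;
}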
On Wed 14-08-19 17:41:06, David Hildenbrand wrote:
> move_pfn_range_to_zone() will set all pages to PG_reserved via
> memmap_init_zone(). The only way a page could no longer be reserved
> would be if a MEM_GOING_ONLINE notifier would clear PG_reserved - which
> is not done (the online_page callback is used for that purpose by
> e.g., Hyper-V instead). walk_system_ram_range() will never call
> online_pages_range() with duplicate PFNs, so drop the PageReserved()
> check.
> 
> This seems to be a leftover from ancient times where the memmap was
> initialized when adding memory and we wanted to check for already
> onlined memory.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Thanks for splitting that up.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/memory_hotplug.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 3706a137d880..10ad970f3f14 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -653,9 +653,7 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  {
>  	unsigned long onlined_pages = *(unsigned long *)arg;
>  
> -	if (PageReserved(pfn_to_page(start_pfn)))
> -		onlined_pages += online_pages_blocks(start_pfn, nr_pages);
> -
> +	onlined_pages += online_pages_blocks(start_pfn, nr_pages);
>  	online_mem_sections(start_pfn, start_pfn + nr_pages);
>  
>  	*(unsigned long *)arg = onlined_pages;
> -- 
> 2.21.0
> 

-- 
Michal Hocko
SUSE Labs
On Wed 14-08-19 17:41:07, David Hildenbrand wrote:
> online_pages always corresponds to nr_pages. Simplify the code, getting
> rid of online_pages_blocks(). Add some comments.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  mm/memory_hotplug.c | 36 ++++++++++++++++--------------------
>  1 file changed, 16 insertions(+), 20 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 10ad970f3f14..63b1775f7cf8 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -632,31 +632,27 @@ static void generic_online_page(struct page *page, unsigned int order)
>  #endif
>  }
>  
> -static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
> -{
> -	unsigned long end = start + nr_pages;
> -	int order, onlined_pages = 0;
> -
> -	while (start < end) {
> -		order = min(MAX_ORDER - 1,
> -			    get_order(PFN_PHYS(end) - PFN_PHYS(start)));
> -		(*online_page_callback)(pfn_to_page(start), order);
> -
> -		onlined_pages += (1UL << order);
> -		start += (1UL << order);
> -	}
> -	return onlined_pages;
> -}
> -
>  static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  			      void *arg)
>  {
> -	unsigned long onlined_pages = *(unsigned long *)arg;
> +	const unsigned long end_pfn = start_pfn + nr_pages;
> +	unsigned long pfn;
> +	int order;
> +
> +	/*
> +	 * Online the pages. The callback might decide to keep some pages
> +	 * PG_reserved (to add them to the buddy later), but we still account
> +	 * them as being online/belonging to this zone ("present").
> +	 */
> +	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
> +		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> +		(*online_page_callback)(pfn_to_page(pfn), order);
> +	}
>  
> -	onlined_pages += online_pages_blocks(start_pfn, nr_pages);
> -	online_mem_sections(start_pfn, start_pfn + nr_pages);
> +	/* mark all involved sections as online */
> +	online_mem_sections(start_pfn, end_pfn);
>  
> -	*(unsigned long *)arg = onlined_pages;
> +	*(unsigned long *)arg += nr_pages;
>  	return 0;
>  }
>  
> -- 
> 2.21.0
> 

-- 
Michal Hocko
SUSE Labs
On 14.08.19 17:41, David Hildenbrand wrote:
> Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
> order") assumed that any PFN we get via memory resources is aligned to
> MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
> check the alignment and fall back to single pages.
>
> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
> mm/memory_hotplug.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 63b1775f7cf8..f245fb50ba7f 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  	 */
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
>  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> +		/* __free_pages_core() wants pfns to be aligned to the order */
> +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
> +			order = 0;
>  		(*online_page_callback)(pfn_to_page(pfn), order);
>  	}
>
>
@Michal, if you insist, we can drop this patch. "break first and fix
later" is not part of my DNA :)
--
Thanks,
David / dhildenb
On Wed, Aug 14, 2019 at 05:41:05PM +0200, David Hildenbrand wrote:
>This makes it clearer that we will never call func() with duplicate PFNs
>in case we have multiple sub-page memory resources. All unaligned parts
>of PFNs are completely discarded.
>
>Cc: Dan Williams <dan.j.williams@intel.com>
>Cc: Borislav Petkov <bp@suse.de>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Bjorn Helgaas <bhelgaas@google.com>
>Cc: Ingo Molnar <mingo@kernel.org>
>Cc: Dave Hansen <dave.hansen@linux.intel.com>
>Cc: Nadav Amit <namit@vmware.com>
>Cc: Wei Yang <richardw.yang@linux.intel.com>
>Cc: Oscar Salvador <osalvador@suse.de>
>Acked-by: Michal Hocko <mhocko@suse.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Wei Yang <richardw.yang@linux.intel.com>

>---
> kernel/resource.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
>diff --git a/kernel/resource.c b/kernel/resource.c
>index 7ea4306503c5..88ee39fa9103 100644
>--- a/kernel/resource.c
>+++ b/kernel/resource.c
>@@ -487,8 +487,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> 	while (start < end &&
> 	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
> 				    false, &res)) {
>-		pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
>-		end_pfn = (res.end + 1) >> PAGE_SHIFT;
>+		pfn = PFN_UP(res.start);
>+		end_pfn = PFN_DOWN(res.end + 1);
> 		if (end_pfn > pfn)
> 			ret = (*func)(pfn, end_pfn - pfn, arg);
> 		if (ret)
>-- 
>2.21.0

-- 
Wei Yang
Help you, Help me
On Wed 14-08-19 18:09:16, David Hildenbrand wrote:
> On 14.08.19 17:41, David Hildenbrand wrote:
> > Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
> > order") assumed that any PFN we get via memory resources is aligned to
> > MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
> > check the alignment and fall back to single pages.
> >
> > Cc: Arun KS <arunks@codeaurora.org>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> > Cc: Dan Williams <dan.j.williams@intel.com>
> > Signed-off-by: David Hildenbrand <david@redhat.com>
> > ---
> > mm/memory_hotplug.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > index 63b1775f7cf8..f245fb50ba7f 100644
> > --- a/mm/memory_hotplug.c
> > +++ b/mm/memory_hotplug.c
> > @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> >  	 */
> >  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
> >  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> > +		/* __free_pages_core() wants pfns to be aligned to the order */
> > +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
> > +			order = 0;
> >  		(*online_page_callback)(pfn_to_page(pfn), order);
> >  	}
> >
> >
>
> @Michal, if you insist, we can drop this patch. "break first and fix
> later" is not part of my DNA :)
I do not insist but have already expressed that I am not a fan of this
change. Also I think that "break first" is quite an overstatement here.
--
Michal Hocko
SUSE Labs
On 14.08.19 20:32, Michal Hocko wrote:
> On Wed 14-08-19 18:09:16, David Hildenbrand wrote:
>> On 14.08.19 17:41, David Hildenbrand wrote:
>>> Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
>>> order") assumed that any PFN we get via memory resources is aligned to
>>> MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
>>> check the alignment and fall back to single pages.
>>>
>>> Cc: Arun KS <arunks@codeaurora.org>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
>>> Cc: Dan Williams <dan.j.williams@intel.com>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> ---
>>> mm/memory_hotplug.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index 63b1775f7cf8..f245fb50ba7f 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>>>  	 */
>>>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
>>>  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
>>> +		/* __free_pages_core() wants pfns to be aligned to the order */
>>> +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
>>> +			order = 0;
>>>  		(*online_page_callback)(pfn_to_page(pfn), order);
>>>  	}
>>>
>>>
>>
>> @Michal, if you insist, we can drop this patch. "break first and fix
>> later" is not part of my DNA :)
>
> I do not insist but have already expressed that I am not a fan of this
> change. Also I think that "break first" is quite an overstatement here.
>
Well this version is certainly nicer than the previous one. I'll let
Andrew decide if he wants to pick it up or drop it from this series.
Thanks!
--
Thanks,
David / dhildenb
On Wed, 14 Aug 2019 17:41:08 +0200 David Hildenbrand <david@redhat.com> wrote:
> Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
> order") assumed that any PFN we get via memory resources is aligned to
> MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
> check the alignment and fall back to single pages.
>
> ...
>
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  	 */
>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
>  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> +		/* __free_pages_core() wants pfns to be aligned to the order */
> +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
> +			order = 0;
>  		(*online_page_callback)(pfn_to_page(pfn), order);
>  	}
We aren't sure if this occurs, but if it does, we silently handle it.
It seems a reasonable defensive thing to do, but should we add a
WARN_ON_ONCE() so that we get to find out about it? If we get such a
report then we can remove the WARN_ON_ONCE() and add an illuminating
comment.
On 14.08.19 22:56, Andrew Morton wrote:
> On Wed, 14 Aug 2019 17:41:08 +0200 David Hildenbrand <david@redhat.com> wrote:
>
>> Commit a9cd410a3d29 ("mm/page_alloc.c: memory hotplug: free pages as higher
>> order") assumed that any PFN we get via memory resources is aligned to
>> MAX_ORDER - 1; I am not convinced that is always true. Let's play safe,
>> check the alignment and fall back to single pages.
>>
>> ...
>>
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>>  	 */
>>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
>>  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
>> +		/* __free_pages_core() wants pfns to be aligned to the order */
>> +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
>> +			order = 0;
>>  		(*online_page_callback)(pfn_to_page(pfn), order);
>>  	}
>
> We aren't sure if this occurs, but if it does, we silently handle it.
>
> It seems a reasonable defensive thing to do, but should we add a
> WARN_ON_ONCE() so that we get to find out about it? If we get such a
> report then we can remove the WARN_ON_ONCE() and add an illuminating
> comment.
>
>
Makes sense, do you want to add the WARN_ON_ONCE() or shall I resend?
I was recently thinking about limiting offlining to memory blocks
without holes - then also onlining would only apply to memory blocks
without holes and we could simplify both paths (single zone/node, no
holes) - including this check, we would always have memory block size
alignments. But I am not sure yet if there is a valid use case for
offlining/re-online boot memory with holes.
--
Thanks,
David / dhildenb
On Wed, 14 Aug 2019 23:47:24 +0200 David Hildenbrand <david@redhat.com> wrote:
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -646,6 +646,9 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> >>  	 */
> >>  	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
> >>  		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
> >> +		/* __free_pages_core() wants pfns to be aligned to the order */
> >> +		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
> >> +			order = 0;
> >>  		(*online_page_callback)(pfn_to_page(pfn), order);
> >>  	}
> >
> > We aren't sure if this occurs, but if it does, we silently handle it.
> >
> > It seems a reasonable defensive thing to do, but should we add a
> > WARN_ON_ONCE() so that we get to find out about it? If we get such a
> > report then we can remove the WARN_ON_ONCE() and add an illuminating
> > comment.
> >
> >
>
> Makes sense, do you want to add the WARN_ON_ONCE() or shall I resend?
--- a/mm/memory_hotplug.c~mm-memory_hotplug-make-sure-the-pfn-is-aligned-to-the-order-when-onlining-fix
+++ a/mm/memory_hotplug.c
@@ -647,7 +647,7 @@ static int online_pages_range(unsigned l
 	for (pfn = start_pfn; pfn < end_pfn; pfn += 1ul << order) {
 		order = min(MAX_ORDER - 1, get_order(PFN_PHYS(end_pfn - pfn)));
 		/* __free_pages_core() wants pfns to be aligned to the order */
-		if (unlikely(!IS_ALIGNED(pfn, 1ul << order)))
+		if (WARN_ON_ONCE(!IS_ALIGNED(pfn, 1ul << order)))
 			order = 0;
 		(*online_page_callback)(pfn_to_page(pfn), order);
 	}
_
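For readers puzzled by using WARN_ON_ONCE() directly in the condition: it
evaluates to the tested condition, so the fallback still triggers on every
misaligned pfn while the backtrace is printed only once. A userspace
approximation of the idiom (GNU C statement expression, like the kernel
macro; not the real implementation):

#include <stdio.h>

/* toy stand-in: warn on the first occurrence, always return the condition */
#define WARN_ON_ONCE(cond) ({					\
	static int __warned;					\
	int __c = !!(cond);					\
	if (__c && !__warned) {					\
		__warned = 1;					\
		fprintf(stderr, "WARNING at %s:%d\n",		\
			__FILE__, __LINE__);			\
	}							\
	__c;							\
})

int main(void)
{
	int i;

	for (i = 0; i < 3; i++)
		if (WARN_ON_ONCE(i % 2 == 0))	/* true for i == 0 and 2 */
			printf("fallback taken for i=%d\n", i);
	/* warning is printed once; fallback still taken for both */
	return 0;
}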