* [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: carver4lio @ 2020-12-03 15:23 UTC
To: rppt; +Cc: akpm, linux-mm, linux-kernel, Hailong Liu

From: Hailong Liu <liu.hailong6@zte.com.cn>

During boot, the pages spanning [start, end) of a memblock are freed to the
buddy allocator in a loop. Each iteration first picks an order as large as
possible (less than MAX_ORDER), then decreases it gradually until the block
no longer extends past end.

However, *min(MAX_ORDER - 1UL, __ffs(start))* cannot get the largest order
in some cases.
Instead, *__ffs(end - start)* may be more appropriate and meaningful.

Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
---
 mm/memblock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index b68ee8678..7c6d0dde7 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1931,7 +1931,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
 	int order;
 
 	while (start < end) {
-		order = min(MAX_ORDER - 1UL, __ffs(start));
+		order = min(MAX_ORDER - 1UL, __ffs(end - start));
 
 		while (start + (1UL << order) > end)
 			order--;
-- 
2.17.1

* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Qian Cai @ 2020-12-04 13:42 UTC
To: carver4lio, rppt
Cc: akpm, linux-mm, linux-kernel, Hailong Liu, Stephen Rothwell, Linux Next Mailing List

On Thu, 2020-12-03 at 23:23 +0800, carver4lio@163.com wrote:
> From: Hailong Liu <liu.hailong6@zte.com.cn>
> 
> When system in the booting stage, pages span from [start, end] of a memblock
> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
> first, then decrease gradually to a proper order(less than end) in a loop.
> 
> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
> in some cases.
> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
> 
> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>

Reverting this commit on the top of today's linux-next fixed boot crashes on
multiple NUMA systems.

[ 5.050736][ T0] flags: 0x3fffc000000000()
[ 5.055103][ T0] raw: 003fffc000000000 ffffea0000000448 ffffea0000000448 0000000000000000
[ 5.063572][ T0] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 5.072045][ T0] page dumped because: VM_BUG_ON_PAGE(pfn & ((1 << order) - 1))
[ 5.079580][ T0] ------------[ cut here ]------------
[ 5.084883][ T0] kernel BUG at mm/page_alloc.c:1015!
[ 5.090151][ T0] invalid opcode: 0000 [#1] SMP KASAN NOPTI
[ 5.095894][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-rc6-next-20201204+ #11
[ 5.104099][ T0] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
[ 5.113370][ T0] RIP: 0010:__free_one_page+0xa19/0x1140
[ 5.118864][ T0] Code: d2 e9 69 f6 ff ff 0f 0b 48 c7 c6 e0 52 2d a5 4c 89 ff e8 7a 98 f8 ff 0f 0b 0f 0b 48 c7 c6 60 53 2d a5 4c 89 ff e8 67 98 f8 ff <0f> 0b 48 c7 c6 c0 53 2d a5 4c 89 ff e8 56 98 f8 ff 0f 0b 48 89 da
[ 5.138427][ T0] RSP: 0000:ffffffffa5807c30 EFLAGS: 00010086
[ 5.144367][ T0] RAX: 0000000000000000 RBX: 0000000000000008 RCX: ffffffffa3c4abf4
[ 5.152228][ T0] RDX: 1ffffd400000008f RSI: 0000000000000000 RDI: ffffea0000000478
[ 5.160091][ T0] RBP: 0000000000000007 R08: fffffbfff5918fc5 R09: fffffbfff5918fc5
[ 5.167951][ T0] R10: ffffffffac8c7e23 R11: fffffbfff5918fc4 R12: 0000000000000000
[ 5.175815][ T0] R13: 0000000000000003 R14: ffff88887fff6000 R15: ffffea0000000440
[ 5.183677][ T0] FS: 0000000000000000(0000) GS:ffff88881e800000(0000) knlGS:0000000000000000
[ 5.192499][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.198963][ T0] CR2: ffff88907efff000 CR3: 0000000ce3e14000 CR4: 00000000000406b0
[ 5.206823][ T0] Call Trace:
[ 5.209978][ T0] ? rwlock_bug.part.1+0x90/0x90
[ 5.214774][ T0] free_one_page+0x7e/0x1e0
[ 5.219142][ T0] __free_pages_ok+0x646/0x13b0
[ 5.223863][ T0] memblock_free_all+0x21c/0x2c0
(inlined by) __free_memory_core at mm/memblock.c:2037
(inlined by) free_low_memory_core_early at mm/memblock.c:2060
(inlined by) memblock_free_all at mm/memblock.c:2100
[ 5.228662][ T0] ? reset_all_zones_managed_pages+0x9a/0x9a
[ 5.234515][ T0] ? memblock_alloc_try_nid+0xe6/0x127
[ 5.239842][ T0] ? memblock_alloc_try_nid_raw+0x12a/0x12a
[ 5.245610][ T0] ? early_amd_iommu_init+0x1e1f/0x1e1f
[ 5.251024][ T0] ? iommu_go_to_state+0x24/0x28
[ 5.255831][ T0] mem_init+0x1a/0x350
[ 5.259762][ T0] mm_init+0x5f/0x87
[ 5.263515][ T0] start_kernel+0x14c/0x3a7
[ 5.267882][ T0] ? copy_bootdata+0x19/0x47
[ 5.272340][ T0] secondary_startup_64_no_verify+0xc2/0xcb
[ 5.278102][ T0] Modules linked in:
[ 5.281869][ T0] random: get_random_bytes called from print_oops_end_marker+0x26/0x40 with crng_init=0
[ 5.281878][ T0] ---[ end trace 32dd7228cc16af82 ]---
[ 5.296795][ T0] RIP: 0010:__free_one_page+0xa19/0x1140
[ 5.302299][ T0] Code: d2 e9 69 f6 ff ff 0f 0b 48 c7 c6 e0 52 2d a5 4c 89 ff e8 7a 98 f8 ff 0f 0b 0f 0b 48 c7 c6 60 53 2d a5 4c 89 ff e8 67 98 f8 ff <0f> 0b 48 c7 c6 c0 53 2d a5 4c 89 ff e8 56 98 f8 ff 0f 0b 48 89 da
[ 5.321864][ T0] RSP: 0000:ffffffffa5807c30 EFLAGS: 00010086
[ 5.327803][ T0] RAX: 0000000000000000 RBX: 0000000000000008 RCX: ffffffffa3c4abf4
[ 5.335665][ T0] RDX: 1ffffd400000008f RSI: 0000000000000000 RDI: ffffea0000000478
[ 5.343526][ T0] RBP: 0000000000000007 R08: fffffbfff5918fc5 R09: fffffbfff5918fc5
[ 5.351389][ T0] R10: ffffffffac8c7e23 R11: fffffbfff5918fc4 R12: 0000000000000000
[ 5.359249][ T0] R13: 0000000000000003 R14: ffff88887fff6000 R15: ffffea0000000440
[ 5.367110][ T0] FS: 0000000000000000(0000) GS:ffff88881e800000(0000) knlGS:0000000000000000
[ 5.375932][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.382397][ T0] CR2: ffff88907efff000 CR3: 0000000ce3e14000 CR4: 00000000000406b0
[ 5.390261][ T0] Kernel panic - not syncing: Fatal exception
[ 5.396320][ T0] ---[ end Kernel panic - not syncing: Fatal exception ]---

> ---
>  mm/memblock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index b68ee8678..7c6d0dde7 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1931,7 +1931,7 @@ static void __init __free_pages_memory(unsigned long
> start, unsigned long end)
>  	int order;
>  
>  	while (start < end) {
> -		order = min(MAX_ORDER - 1UL, __ffs(start));
> +		order = min(MAX_ORDER - 1UL, __ffs(end - start));
>  
>  		while (start + (1UL << order) > end)
>  			order--;

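The check that fires in the trace above, VM_BUG_ON_PAGE(pfn & ((1 << order) - 1)), requires a block of a given order to start at a pfn that is a multiple of 1 << order. The following is a minimal userspace sketch, not kernel code, of how the two order formulas from the patch behave on a made-up range. lowest_set_bit() stands in for __ffs(), pick_order() and is_buddy_aligned() are hypothetical helper names, and MAX_ORDER is assumed to be 11.

```c
#include <assert.h>

/* Assumption: MAX_ORDER is 11, a common default; it is configurable. */
#define MAX_ORDER 11UL

/* Userspace stand-in for the kernel's __ffs(): index of the lowest set bit. */
static unsigned long lowest_set_bit(unsigned long x)
{
	unsigned long bit = 0;

	assert(x != 0); /* like __ffs(), the result is undefined for 0 */
	while (!(x & 1UL)) {
		x >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Order chosen for the first block of [start, end), mirroring the loop body
 * of __free_pages_memory(). Pass seed = start for the original formula and
 * seed = end - start for the patched one. Requires start < end.
 */
static unsigned long pick_order(unsigned long start, unsigned long end,
				unsigned long seed)
{
	unsigned long order = lowest_set_bit(seed);

	if (order > MAX_ORDER - 1UL)
		order = MAX_ORDER - 1UL;
	/* Shrink the block until it no longer extends past end. */
	while (start + (1UL << order) > end)
		order--;
	return order;
}

/* Non-zero when pfn passes the condition tested by the VM_BUG_ON_PAGE. */
static int is_buddy_aligned(unsigned long pfn, unsigned long order)
{
	return (pfn & ((1UL << order) - 1)) == 0;
}
```

For the hypothetical range [16, 48), pick_order(16, 48, 16) gives order 4, which is aligned at pfn 16. pick_order(16, 48, 48 - 16) gives order 5: 32 pages do fit in the range, but pfn 16 is not a multiple of 32, so is_buddy_aligned(16, 5) is false.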
* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Marek Szyprowski @ 2020-12-04 16:07 UTC
To: Qian Cai, carver4lio, rppt
Cc: akpm, linux-mm, linux-kernel, Hailong Liu, Stephen Rothwell, Linux Next Mailing List, Bartlomiej Zolnierkiewicz

Hi All,

On 04.12.2020 14:42, Qian Cai wrote:
> On Thu, 2020-12-03 at 23:23 +0800, carver4lio@163.com wrote:
>> From: Hailong Liu <liu.hailong6@zte.com.cn>
>>
>> When system in the booting stage, pages span from [start, end] of a memblock
>> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
>> first, then decrease gradually to a proper order(less than end) in a loop.
>>
>> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
>> in some cases.
>> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
>>
>> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
> Reverting this commit on the top of today's linux-next fixed boot crashes on
> multiple NUMA systems.

I confirm. Reverting commit 4df001639c84 ("mm/memblock: use a more
appropriate order calculation when free memblock pages") on top of linux
next-20201204 fixed booting of my ARM32bit test systems.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Jon Hunter @ 2020-12-04 17:43 UTC
To: Marek Szyprowski, Qian Cai, carver4lio, rppt
Cc: akpm, linux-mm, linux-kernel, Hailong Liu, Stephen Rothwell, Linux Next Mailing List, Bartlomiej Zolnierkiewicz, linux-tegra

On 04/12/2020 16:07, Marek Szyprowski wrote:
> Hi All,
> 
> On 04.12.2020 14:42, Qian Cai wrote:
>> On Thu, 2020-12-03 at 23:23 +0800, carver4lio@163.com wrote:
>>> From: Hailong Liu <liu.hailong6@zte.com.cn>
>>>
>>> When system in the booting stage, pages span from [start, end] of a memblock
>>> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
>>> first, then decrease gradually to a proper order(less than end) in a loop.
>>>
>>> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
>>> in some cases.
>>> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
>>>
>>> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
>> Reverting this commit on the top of today's linux-next fixed boot crashes on
>> multiple NUMA systems.
> 
> I confirm. Reverting commit 4df001639c84 ("mm/memblock: use a more
> appropriate order calculation when free memblock pages") on top of linux
> next-20201204 fixed booting of my ARM32bit test systems.

FWIW, I also confirm that this is causing several 32-bit Tegra platforms
to crash on boot and reverting this fixes the problem.

Jon

-- 
nvpublic

* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Anders Roxell @ 2020-12-05 17:09 UTC
To: Jon Hunter
Cc: Marek Szyprowski, Qian Cai, carver4lio, rppt, Andrew Morton, Linux-MM, Linux Kernel Mailing List, Hailong Liu, Stephen Rothwell, Linux Next Mailing List, Bartlomiej Zolnierkiewicz, linux-tegra

On Fri, 4 Dec 2020 at 18:44, Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
> On 04/12/2020 16:07, Marek Szyprowski wrote:
> > Hi All,
> >
> > On 04.12.2020 14:42, Qian Cai wrote:
> >> On Thu, 2020-12-03 at 23:23 +0800, carver4lio@163.com wrote:
> >>> From: Hailong Liu <liu.hailong6@zte.com.cn>
> >>>
> >>> When system in the booting stage, pages span from [start, end] of a memblock
> >>> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
> >>> first, then decrease gradually to a proper order(less than end) in a loop.
> >>>
> >>> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
> >>> in some cases.
> >>> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
> >>>
> >>> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
> >> Reverting this commit on the top of today's linux-next fixed boot crashes on
> >> multiple NUMA systems.
> >
> > I confirm. Reverting commit 4df001639c84 ("mm/memblock: use a more
> > appropriate order calculation when free memblock pages") on top of linux
> > next-20201204 fixed booting of my ARM32bit test systems.
>
>
> FWIW, I also confirm that this is causing several 32-bit Tegra platforms
> to crash on boot and reverting this fixes the problem.

I had the same experience on an arm64 system.

Cheers,
Anders

* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Anders Roxell @ 2020-12-05 17:12 UTC
To: Jon Hunter
Cc: Marek Szyprowski, Qian Cai, carver4lio, rppt, Andrew Morton, Linux-MM, Linux Kernel Mailing List, Hailong Liu, Stephen Rothwell, Linux Next Mailing List, Bartlomiej Zolnierkiewicz, linux-tegra

On Sat, 5 Dec 2020 at 18:09, Anders Roxell <anders.roxell@linaro.org> wrote:
>
> On Fri, 4 Dec 2020 at 18:44, Jon Hunter <jonathanh@nvidia.com> wrote:
> >
> >
> > On 04/12/2020 16:07, Marek Szyprowski wrote:
> > > Hi All,
> > >
> > > On 04.12.2020 14:42, Qian Cai wrote:
> > >> On Thu, 2020-12-03 at 23:23 +0800, carver4lio@163.com wrote:
> > >>> From: Hailong Liu <liu.hailong6@zte.com.cn>
> > >>>
> > >>> When system in the booting stage, pages span from [start, end] of a memblock
> > >>> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
> > >>> first, then decrease gradually to a proper order(less than end) in a loop.
> > >>>
> > >>> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
> > >>> in some cases.
> > >>> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
> > >>>
> > >>> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
> > >> Reverting this commit on the top of today's linux-next fixed boot crashes on
> > >> multiple NUMA systems.
> > >
> > > I confirm. Reverting commit 4df001639c84 ("mm/memblock: use a more
> > > appropriate order calculation when free memblock pages") on top of linux
> > > next-20201204 fixed booting of my ARM32bit test systems.
> >
> >
> > FWIW, I also confirm that this is causing several 32-bit Tegra platforms
> > to crash on boot and reverting this fixes the problem.
>
> I had the same experience on an arm64 system.

This is the log that I see:

[ 0.000000][ T0] percpu: Embedded 507 pages/cpu s2036568 r8192 d31912 u2076672
[ 0.000000][ T0] Detected VIPT I-cache on CPU0
[ 0.000000][ T0] CPU features: detected: ARM erratum 845719
[ 0.000000][ T0] CPU features: GIC system register CPU interface present but disabled by higher exception level
[ 0.000000][ T0] CPU features: kernel page table isolation forced OFF by kpti command line option
[ 0.000000][ T0] Built 1 zonelists, mobility grouping on. Total pages: 516096
[ 0.000000][ T0] Policy zone: DMA
[ 0.000000][ T0] Kernel command line: root=/dev/root rootfstype=9p rootflags=trans=virtio console=ttyAMA0,38400n8 earlycon=pl011,0x9000000 initcall_debug softlockup_panic=0 security=none kpti=no
[ 0.000000][ T0] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes, linear)
[ 0.000000][ T0] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
[ 0.000000][ T0] mem auto-init: stack:off, heap alloc:on, heap free:on
[ 0.000000][ T0] mem auto-init: clearing system memory may take some time...
[ 0.000000][ T0] page:(____ptrval____) refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x40010
[ 0.000000][ T0] flags: 0x1fffe0000000000()
[ 0.000000][ T0] raw: 01fffe0000000000 fffffc0000000408 fffffc0000000408 0000000000000000
[ 0.000000][ T0] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 0.000000][ T0] page dumped because: VM_BUG_ON_PAGE(pfn & ((1 << order) - 1))
[ 0.000000][ T0] ------------[ cut here ]------------
[ 0.000000][ T0] kernel BUG at mm/page_alloc.c:1015!
[ 0.000000][ T0] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[ 0.000000][ T0] Modules linked in:
[ 0.000000][ T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-rc6-next-20201204-00010-g7f8e9106f747-dirty #1
[ 0.000000][ T0] Hardware name: linux,dummy-virt (DT)
[ 0.000000][ T0] pstate: 40400085 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
[ 0.000000][ T0] pc : __free_one_page+0x14c/0x700
[ 0.000000][ T0] lr : __free_one_page+0x14c/0x700
[ 0.000000][ T0] sp : ffff800013fd7c10
[ 0.000000][ T0] x29: ffff800013fd7c10 x28: 0000000000000000
[ 0.000000][ T0] x27: 0000000000000200 x26: 0000000000000001
[ 0.000000][ T0] x25: 0000000000000000 x24: 0000000000000009
[ 0.000000][ T0] x23: ffff00007dbfbd40 x22: fffffc0000000400
[ 0.000000][ T0] x21: 0000000000040010 x20: 0000000000000009
[ 0.000000][ T0] x19: 00000000000001ff x18: 0000000000000000
[ 0.000000][ T0] x17: 0000000000000000 x16: 0000000000000000
[ 0.000000][ T0] x15: 0000000000000000 x14: 0000000000000000
[ 0.000000][ T0] x13: 0000000000000000 x12: ffff70000281852d
[ 0.000000][ T0] x11: 1ffff0000281852c x10: ffff70000281852c
[ 0.000000][ T0] x9 : dfff800000000000 x8 : ffff8000140c2960
[ 0.000000][ T0] x7 : 0000000000000001 x6 : 00008ffffd7e7ad4
[ 0.000000][ T0] x5 : 0000000000000000 x4 : 0000000000000000
[ 0.000000][ T0] x3 : ffff80001400ab00 x2 : 0000000000000000
[ 0.000000][ T0] x1 : 0000000000000000 x0 : 0000000000000000
[ 0.000000][ T0] Call trace:
[ 0.000000][ T0] __free_one_page+0x14c/0x700
[ 0.000000][ T0] free_one_page+0xf0/0x120
[ 0.000000][ T0] __free_pages_ok+0x720/0x780
[ 0.000000][ T0] __free_pages_core+0x240/0x280
[ 0.000000][ T0] memblock_free_pages+0x40/0x50
[ 0.000000][ T0] free_low_memory_core_early+0x230/0x2f0
[ 0.000000][ T0] memblock_free_all+0x28/0x58
[ 0.000000][ T0] mem_init+0xf0/0x10c
[ 0.000000][ T0] mm_init+0xb4/0xe8
[ 0.000000][ T0] start_kernel+0x1e0/0x520
[ 0.000000][ T0] Code: 913a8021 aa1603e0 91030021 97fe7ec6 (d4210000)
[ 0.000000][ T0] random: get_random_bytes called from oops_exit+0x50/0xa0 with crng_init=0
[ 0.000000][ T0] ---[ end trace 0000000000000000 ]---
[ 0.000000][ T0] Kernel panic - not syncing: Oops - BUG: Fatal exception
[ 0.000000][ T0] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]---

Cheers,
Anders

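The page dump above reports pfn:0x40010. As a rough illustration, the sketch below evaluates the failing expression for that pfn in userspace. Both helper names are hypothetical. The order the kernel actually used is not printed in the log, so order 9 in the note after the code is only a guess (x19 holding 0x1ff and x20/x24 holding 9 would be consistent with it).

```c
#include <assert.h>

/*
 * Value of the expression inside VM_BUG_ON_PAGE(pfn & ((1 << order) - 1)).
 * A non-zero result means the check fires.
 */
static unsigned long misaligned_bits(unsigned long pfn, unsigned int order)
{
	return pfn & ((1UL << order) - 1);
}

/* Largest order, up to max_order, at which pfn is still aligned. */
static unsigned int largest_aligned_order(unsigned long pfn,
					  unsigned int max_order)
{
	unsigned int order = 0;

	/* Keep the shift in misaligned_bits() within the word size. */
	assert(max_order < sizeof(unsigned long) * 8);
	while (order < max_order && misaligned_bits(pfn, order + 1) == 0)
		order++;
	return order;
}
```

largest_aligned_order(0x40010, 10) is 4, so a block starting at this pfn can span at most 16 pages, and misaligned_bits(0x40010, 9) is 0x10, which is non-zero. For a non-zero pfn, __ffs(pfn) is exactly the largest order at which the pfn is aligned, which is why the original min(MAX_ORDER - 1UL, __ffs(start)) never produces a misaligned block.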
* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: Mike Rapoport @ 2020-12-06 11:55 UTC
To: carver4lio; +Cc: akpm, linux-mm, linux-kernel, Hailong Liu

On Thu, Dec 03, 2020 at 11:23:10PM +0800, carver4lio@163.com wrote:
> From: Hailong Liu <liu.hailong6@zte.com.cn>
> 
> When system in the booting stage, pages span from [start, end] of a memblock
> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
> first, then decrease gradually to a proper order(less than end) in a loop.
> 
> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
> in some cases.

Do you have examples?
What is the memory configuration that causes suboptimal order selection
and what is the order in this case?

> Instead, *__ffs(end - start)* may be more appropriate and meaningful.

As several people reported, using __ffs(end - start) is not correct.
If the order selection is indeed suboptimal, we'd need some better
formula ;-)

> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
> ---
>  mm/memblock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/memblock.c b/mm/memblock.c
> index b68ee8678..7c6d0dde7 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1931,7 +1931,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
>  	int order;
>  
>  	while (start < end) {
> -		order = min(MAX_ORDER - 1UL, __ffs(start));
> +		order = min(MAX_ORDER - 1UL, __ffs(end - start));
>  
>  		while (start + (1UL << order) > end)
>  			order--;
> -- 
> 2.17.1
> 
> 

-- 
Sincerely yours,
Mike.

* Re: [PATCH] mm/memblock:use a more appropriate order calculation when free memblock pages
From: carver4lio @ 2020-12-06 14:21 UTC
To: Mike Rapoport; +Cc: akpm, linux-mm, linux-kernel, Hailong Liu

On 12/6/20 7:55 PM, Mike Rapoport wrote:
> On Thu, Dec 03, 2020 at 11:23:10PM +0800, carver4lio@163.com wrote:
>> From: Hailong Liu <liu.hailong6@zte.com.cn>
>>
>> When system in the booting stage, pages span from [start, end] of a memblock
>> are freed to buddy in a order as large as possible (less than MAX_ORDER) at
>> first, then decrease gradually to a proper order(less than end) in a loop.
>>
>> However, *min(MAX_ORDER - 1UL, __ffs(start))* can not get the largest order
>> in some cases.
> 
> Do you have examples?
> What is the memory configuration that causes suboptimal order selection
> and what is the order in this case?
> 
I'm sorry for my careless and inadequate testing (I only tested it on my x86
machine with 8 cores).

On my x86_64 machine, the layout of RAM looks like:

/ # cat /proc/iomem
00000100-00000fff : reserved
00001000-0009c7ff : System RAM
0009c800-0009ffff : reserved
.....
100000000-22dffffff : System RAM
  22c600000-22d0e01c0 : Kernel code
  22d0e01c1-22d96af3f : Kernel data
  22dae5000-22dbdcfff : Kernel bss
22e000000-22fffffff : RAM buffer

On my machine, I noticed that when the order of a start pfn is less than
MAX_ORDER, e.g. for the start physical address 0x00001000, the *order*
returned by *min(MAX_ORDER - 1UL, __ffs(start))* will be 1, but the free page
span of the memblock is larger than order 1; I guess it should be based on
(end - start).

I tested my idea with some instrumentation code like this:

diff --git a/mm/memblock.c b/mm/memblock.c
index b68ee86788af..b0143e3f75db 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1928,18 +1928,23 @@ early_param("memblock", early_memblock);
 
 static void __init __free_pages_memory(unsigned long start, unsigned long end)
 {
-	int order;
+	int order, loop_cnt = 0, adjust_cnt = 0;
+	unsigned long first_pfn = start;	/* start is advanced below */
 
 	while (start < end) {
 		order = min(MAX_ORDER - 1UL, __ffs(start));
 
-		while (start + (1UL << order) > end)
+		while (start + (1UL << order) > end) {
 			order--;
-
+			adjust_cnt++;
+		}
 		memblock_free_pages(pfn_to_page(start), start, order);
 
 		start += (1UL << order);
+		loop_cnt++;
 	}
+	pr_info("TST:[start %lu, end %lu]: loop cnt %d, adjust cnt %d\n",
+		first_pfn, end, loop_cnt, adjust_cnt);
 }

If I change __ffs(start) to __ffs(end - start), the printed info shows a
smaller loop_cnt and adjust_cnt on my machine.

>> Instead, *__ffs(end - start)* may be more appropriate and meaningful.
> 
> As several people reported using __ffs(end - start) is not correct.
> If the order selection is indeed suboptimal we'd need some better
> formula ;-)
> 
>> Signed-off-by: Hailong Liu <liu.hailong6@zte.com.cn>
>> ---
>>  mm/memblock.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index b68ee8678..7c6d0dde7 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1931,7 +1931,7 @@ static void __init __free_pages_memory(unsigned long start, unsigned long end)
>>  	int order;
>>  
>>  	while (start < end) {
>> -		order = min(MAX_ORDER - 1UL, __ffs(start));
>> +		order = min(MAX_ORDER - 1UL, __ffs(end - start));
>>  
>>  		while (start + (1UL << order) > end)
>>  			order--;
>> -- 
>> 2.17.1
>>
>>
> 

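To compare the two formulas outside the kernel, the counting experiment described above can be approximated in userspace. This is a sketch under stated assumptions, not the code that was run: it assumes 4 KiB pages, so the first System RAM range 0x00001000-0x0009c7ff is taken as pfns [0x1, 0x9c); it assumes MAX_ORDER is 11; and it adds a misaligned counter that the posted instrumentation does not have. struct walk_stats, walk_range() and lowest_set_bit() are hypothetical names.

```c
#include <assert.h>

/* Assumption: MAX_ORDER is 11, a common default; it is configurable. */
#define MAX_ORDER 11UL

struct walk_stats {
	int loop_cnt;   /* blocks that would be handed to the buddy allocator */
	int adjust_cnt; /* times the order was lowered so the block fits */
	int misaligned; /* blocks whose start pfn is not a multiple of 1 << order */
};

/* Userspace stand-in for the kernel's __ffs(): index of the lowest set bit. */
static unsigned long lowest_set_bit(unsigned long x)
{
	unsigned long bit = 0;

	assert(x != 0); /* like __ffs(), the result is undefined for 0 */
	while (!(x & 1UL)) {
		x >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Walk the pfn range [start, end) like __free_pages_memory() does, without
 * freeing anything. use_patched selects __ffs(end - start) over __ffs(start).
 * Requires start > 0 for the original formula.
 */
static struct walk_stats walk_range(unsigned long start, unsigned long end,
				    int use_patched)
{
	struct walk_stats st = { 0, 0, 0 };

	while (start < end) {
		unsigned long seed = use_patched ? end - start : start;
		unsigned long order = lowest_set_bit(seed);

		if (order > MAX_ORDER - 1UL)
			order = MAX_ORDER - 1UL;
		while (start + (1UL << order) > end) {
			order--;
			st.adjust_cnt++;
		}
		if (start & ((1UL << order) - 1))
			st.misaligned++;
		start += 1UL << order;
		st.loop_cnt++;
	}
	return st;
}
```

Under these assumptions, walk_range(0x1, 0x9c, 0) covers the range in 10 blocks with 5 order adjustments and no misaligned block, while walk_range(0x1, 0x9c, 1) needs only 5 blocks and no adjustments but starts 3 of them (at pfns 4, 12 and 28) on misaligned pfns. A lower loop count is therefore not by itself evidence of a better split. Note also that with 4 KiB pages the physical address 0x00001000 is pfn 1, for which __ffs() returns 0.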