* [PATCH v3 0/3] Enable >0 order folio memory compaction
@ 2024-02-02 16:15 Zi Yan
  2024-02-02 16:15 ` [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios Zi Yan
  ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Zi Yan @ 2024-02-02 16:15 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Zi Yan, Huang, Ying, Ryan Roberts, Andrew Morton,
	Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao,
	Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Baolin Wang,
	Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain,
	Adam Manzanares, Vishal Moola (Oracle)

From: Zi Yan <ziy@nvidia.com>

Hi all,

This patchset enables >0 order folio memory compaction, which is one of
the prerequisites for large folio support [1]. It includes the fix [4]
for V2 and is on top of mm-everything-2024-01-29-07-19.

I am aware that splitting free pages is necessary for folio migration in
compaction: if >0 order free pages are never split and no order-0 free
page is scanned, compaction ends prematurely because migration returns
-ENOMEM. Free page split becomes a must instead of an optimization.

lkp ncompare results for the default LRU (-no-mglru) and CONFIG_LRU_GEN
are shown at the bottom (on an 8-CPU (Intel Xeon E5-2650 v4 @ 2.20GHz)
16G VM). In sum, most vm-scalability workloads see no performance change,
and the others see a ~4% to ~26% performance boost under the default LRU
and a ~2% to ~6% performance boost under CONFIG_LRU_GEN.

Changelog
===

From V2 [3]:
1. Added the missing free page count in the fast isolation path. This
   fixed the weird performance outcome.

From V1 [2]:
1. Used folio_test_large() instead of folio_order() > 0. (per Matthew
   Wilcox)
2. Fixed a code rebase error. (per Baolin Wang)
3. Used list_splice_init() instead of list_splice(). (per Ryan Roberts)
4. Added free_pages_prepare_fpi_none() to avoid duplicating free page
   code in compaction_free().
5. Dropped the source page order sorting patch.

From RFC [1]:
1. Enabled >0 order folio compaction in the first patch by splitting all
   to-be-migrated folios. (per Huang, Ying)
2. Stopped isolating compound pages with order greater than cc->order to
   avoid wasting effort, since cc->order gives a hint that no free pages
   with order greater than it exist, thus migrating the compound pages
   will fail. (per Baolin Wang)
3. Retained the folio check within the lru lock. (per Baolin Wang)
4. Made isolate_freepages_block() generate order-sorted multi lists.
   (per Johannes Weiner)

Overview
===

To support >0 order folio compaction, the patchset changes how free pages
used for migration are kept during compaction. Free pages used to be
split into order-0 pages and post-allocation processed (i.e., the
PageBuddy flag cleared, the page order stored in page->private zeroed,
and the page reference set to 1). Now all free pages are kept in a
MAX_ORDER+1 array of page lists based on their order, without
post-allocation processing. When migrate_pages() asks for a new page, one
of the free pages, based on the requested page order, is then processed
and given out.

Feel free to give comments and ask questions.

Thanks.
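As a rough illustration of the bookkeeping described above, the scheme can be
sketched in plain C: an array of per-order free lists that hands out a page of
the requested order on demand. This is a hypothetical userspace model, not the
kernel code; the names, the MAX_ORDER value, and the absence of the fallback
split (added in patch 3) are all illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ORDER 10 /* illustrative; the kernel value is arch-dependent */

struct free_page {
        int order;
        struct free_page *next;
};

/* One list per order, mirroring the MAX_ORDER+1 array in the patchset. */
struct free_area {
        struct free_page *lists[MAX_ORDER + 1];
};

static void free_area_add(struct free_area *fa, struct free_page *p)
{
        /* Pages are stashed by order, with no post-allocation processing. */
        p->next = fa->lists[p->order];
        fa->lists[p->order] = p;
}

/*
 * Hand out a page of exactly @order if one is queued; the real code would
 * fall back to splitting a larger free page (patch 3 adds that path).
 */
static struct free_page *free_area_take(struct free_area *fa, int order)
{
        struct free_page *p = fa->lists[order];

        if (p)
                fa->lists[order] = p->next;
        return p;
}
```

In this model, only the page actually handed to migrate_pages() would get the
post-allocation processing (clearing PageBuddy, zeroing page->private, setting
the refcount), which is the deferral the overview describes.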
vm-scalability results on CONFIG_LRU_GEN
===

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
  15107616          +3.2%   15590339          +1.3%   15297619          +3.0%   15567998        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
  12611785          +1.8%   12832919          +0.9%   12724223          +1.6%   12812682        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
   9833393          +5.7%   10390190          +3.0%   10126606          +5.9%   10408804        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
   7034709 ± 3%      +2.9%    7241429          +3.2%    7256680 ± 2%      +3.9%    7308375        vm-scalability.throughput

vm-scalability results on default LRU (with -no-mglru suffix)
===

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
  14401491          +3.7%   14940270          +2.4%   14748626          +4.0%   14975716        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
  11407497          +5.1%   11989632          -0.5%   11349272          +4.8%   11957423        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq-mt/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
  11348474          +3.3%   11719453          -1.2%   11208759          +3.7%   11771926        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
   8065614 ± 3%      +7.7%    8686626 ± 2%      +5.0%    8467577 ± 4%     +11.8%    9016077 ± 2%   vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability

commit:
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f
---------------- --------------------------- --------------------------- ---------------------------
       %stddev      %change    %stddev      %change    %stddev      %change    %stddev
           \            |          \            |          \            |          \
   6438422 ± 2%     +27.5%    8206734 ± 2%     +10.6%    7118390          +26.2%    8127192 ± 4%   vm-scalability.throughput

[1] https://lore.kernel.org/linux-mm/20230912162815.440749-1-zi.yan@sent.com/
[2] https://lore.kernel.org/linux-mm/20231113170157.280181-1-zi.yan@sent.com/
[3] https://lore.kernel.org/linux-mm/20240123034636.1095672-1-zi.yan@sent.com/
[4] https://lore.kernel.org/linux-mm/23BA8CC1-1014-4D09-9C33-938638E13C01@nvidia.com/

Zi Yan (3):
  mm/compaction: enable compacting >0 order folios.
  mm/compaction: add support for >0 order folio memory compaction.
  mm/compaction: optimize >0 order folio compaction with free page split.

 mm/compaction.c | 219 ++++++++++++++++++++++++++++++++++--------------
 mm/internal.h   |   9 +-
 mm/page_alloc.c |   6 ++
 3 files changed, 170 insertions(+), 64 deletions(-)

-- 
2.43.0

^ permalink raw reply	[flat|nested] 21+ messages in thread
* [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios.
  2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan
@ 2024-02-02 16:15 ` Zi Yan
  2024-02-09 14:32   ` Vlastimil Babka
  2024-02-02 16:15 ` [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction Zi Yan
  ` (3 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Zi Yan @ 2024-02-02 16:15 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: Zi Yan, Huang, Ying, Ryan Roberts, Andrew Morton,
	Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao,
	Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Baolin Wang,
	Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain,
	Adam Manzanares, Vishal Moola (Oracle)

From: Zi Yan <ziy@nvidia.com>

migrate_pages() supports >0 order folio migration, and during compaction,
even if compaction_alloc() cannot provide >0 order free pages,
migrate_pages() can split the source page and try to migrate the base
pages from the split. It can serve as a baseline and starting point for
adding support for compacting >0 order folios.

Suggested-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 4add68d40e8d..e43e898d2c77 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
         return too_many;
 }
 
+/*
+ * 1. if the page order is larger than or equal to target_order (i.e.,
+ * cc->order and when it is not -1 for global compaction), skip it since
+ * target_order already indicates no free page with larger than target_order
+ * exists and later migrating it will most likely fail;
+ *
+ * 2. compacting > pageblock_order pages does not improve memory fragmentation,
+ * skip them;
+ */
+static bool skip_isolation_on_order(int order, int target_order)
+{
+        return (target_order != -1 && order >= target_order) ||
+                order >= pageblock_order;
+}
+
 /**
  * isolate_migratepages_block() - isolate all migrate-able pages within
  *                                a single pageblock
@@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
         /*
          * Regardless of being on LRU, compound pages such as THP and
          * hugetlbfs are not to be compacted unless we are attempting
-         * an allocation much larger than the huge page size (eg CMA).
+         * an allocation larger than the compound page size.
          * We can potentially save a lot of iterations if we skip them
          * at once. The check is racy, but we can consider only valid
          * values and the only danger is skipping too much.
@@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
         if (PageCompound(page) && !cc->alloc_contig) {
                 const unsigned int order = compound_order(page);
 
-                if (likely(order <= MAX_PAGE_ORDER)) {
-                        low_pfn += (1UL << order) - 1;
-                        nr_scanned += (1UL << order) - 1;
+                /*
+                 * Skip based on page order and compaction target order
+                 * and skip hugetlbfs pages.
+                 */
+                if (skip_isolation_on_order(order, cc->order) ||
+                    PageHuge(page)) {
+                        if (order <= MAX_PAGE_ORDER) {
+                                low_pfn += (1UL << order) - 1;
+                                nr_scanned += (1UL << order) - 1;
+                        }
+                        goto isolate_fail;
                 }
-                goto isolate_fail;
         }
 
         /*
@@ -1165,10 +1187,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
         }
 
         /*
-         * folio become large since the non-locked check,
-         * and it's on LRU.
+         * Check LRU folio order under the lock
          */
-        if (unlikely(folio_test_large(folio) && !cc->alloc_contig)) {
+        if (unlikely(skip_isolation_on_order(folio_order(folio),
+                                             cc->order) &&
+                     !cc->alloc_contig)) {
                 low_pfn += folio_nr_pages(folio) - 1;
                 nr_scanned += folio_nr_pages(folio) - 1;
                 folio_set_lru(folio);
@@ -1786,6 +1809,10 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
         struct compact_control *cc = (struct compact_control *)data;
         struct folio *dst;
 
+        /* this makes migrate_pages() split the source page and retry */
+        if (folio_test_large(src) > 0)
+                return NULL;
+
         if (list_empty(&cc->freepages)) {
                 isolate_freepages(cc);
 
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 21+ messages in thread
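For readers skimming the thread, the skip predicate introduced by this patch
can be exercised standalone. A minimal sketch, assuming pageblock_order is 9
(as on x86-64 with 4KB base pages and 2MB pageblocks); the macro name and
fixed value are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel's pageblock_order. */
#define PAGEBLOCK_ORDER 9

/*
 * Same logic as the patch: skip isolation when the folio is at least as
 * large as the allocation being compacted for (target_order, -1 meaning
 * global compaction with no target), or when it already spans a whole
 * pageblock and migrating it cannot reduce fragmentation.
 */
static bool skip_isolation_on_order(int order, int target_order)
{
        return (target_order != -1 && order >= target_order) ||
                order >= PAGEBLOCK_ORDER;
}
```

For example, with a compaction target of order 4, an order-4 THP is skipped
(migration would likely fail for lack of an order-4 free page), while an
order-3 folio is still isolated; under global compaction (target -1), only
folios of pageblock size or larger are skipped.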
* Re: [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios.
  2024-02-02 16:15 ` [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios Zi Yan
@ 2024-02-09 14:32   ` Vlastimil Babka
  2024-02-09 19:25     ` Zi Yan
  0 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-02-09 14:32 UTC (permalink / raw)
  To: Zi Yan, linux-mm, linux-kernel
  Cc: Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle),
	David Hildenbrand, Yin, Fengwei, Yu Zhao, Kirill A . Shutemov,
	Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri,
	Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle)

On 2/2/24 17:15, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
>
> migrate_pages() supports >0 order folio migration, and during compaction,
> even if compaction_alloc() cannot provide >0 order free pages,
> migrate_pages() can split the source page and try to migrate the base
> pages from the split. It can serve as a baseline and starting point for
> adding support for compacting >0 order folios.
>
> Suggested-by: Huang Ying <ying.huang@intel.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
>  1 file changed, 35 insertions(+), 8 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 4add68d40e8d..e43e898d2c77 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
>          return too_many;
>  }
>
> +/*
> + * 1. if the page order is larger than or equal to target_order (i.e.,
> + * cc->order and when it is not -1 for global compaction), skip it since
> + * target_order already indicates no free page with larger than target_order
> + * exists and later migrating it will most likely fail;
> + *
> + * 2. compacting > pageblock_order pages does not improve memory fragmentation,
> + * skip them;
> + */
> +static bool skip_isolation_on_order(int order, int target_order)
> +{
> +        return (target_order != -1 && order >= target_order) ||
> +                order >= pageblock_order;
> +}
> +
>  /**
>   * isolate_migratepages_block() - isolate all migrate-able pages within
>   *                                a single pageblock
> @@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>          /*
>           * Regardless of being on LRU, compound pages such as THP and
>           * hugetlbfs are not to be compacted unless we are attempting
> -         * an allocation much larger than the huge page size (eg CMA).
> +         * an allocation larger than the compound page size.
>           * We can potentially save a lot of iterations if we skip them
>           * at once. The check is racy, but we can consider only valid
>           * values and the only danger is skipping too much.
> @@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>          if (PageCompound(page) && !cc->alloc_contig) {
>                  const unsigned int order = compound_order(page);
>
> -                if (likely(order <= MAX_PAGE_ORDER)) {
> -                        low_pfn += (1UL << order) - 1;
> -                        nr_scanned += (1UL << order) - 1;
> +                /*
> +                 * Skip based on page order and compaction target order
> +                 * and skip hugetlbfs pages.
> +                 */
> +                if (skip_isolation_on_order(order, cc->order) ||
> +                    PageHuge(page)) {

Hm I'd try to avoid a new PageHuge() test here.

Earlier we have a block that does
	if (PageHuge(page) && cc->alloc_contig) {
	...

I think I'd rather rewrite it to handle the PageHuge() case completely and
just make it skip the 1UL << order pages there for !cc->alloc_contig. Even
if it means duplicating a bit of the low_pfn and nr_scanned bumping code.

Which reminds me the PageHuge() check there is probably still broken ATM:

https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/

Even better reason not to add another one.
If the huge page materialized since the first check, we should bail out when
testing PageLRU later anyway.

> +                        if (order <= MAX_PAGE_ORDER) {
> +                                low_pfn += (1UL << order) - 1;
> +                                nr_scanned += (1UL << order) - 1;
> +                        }
> +                        goto isolate_fail;
>                  }
> -                goto isolate_fail;
>          }
>
>          /*
> @@ -1165,10 +1187,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>          }
>
>          /*
> -         * folio become large since the non-locked check,
> -         * and it's on LRU.
> +         * Check LRU folio order under the lock
>           */
> -        if (unlikely(folio_test_large(folio) && !cc->alloc_contig)) {
> +        if (unlikely(skip_isolation_on_order(folio_order(folio),
> +                                             cc->order) &&
> +                     !cc->alloc_contig)) {
>                  low_pfn += folio_nr_pages(folio) - 1;
>                  nr_scanned += folio_nr_pages(folio) - 1;
>                  folio_set_lru(folio);
> @@ -1786,6 +1809,10 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>          struct compact_control *cc = (struct compact_control *)data;
>          struct folio *dst;
>
> +        /* this makes migrate_pages() split the source page and retry */
> +        if (folio_test_large(src) > 0)
> +                return NULL;
> +
>          if (list_empty(&cc->freepages)) {
>                  isolate_freepages(cc);
>

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios.
  2024-02-09 14:32   ` Vlastimil Babka
@ 2024-02-09 19:25     ` Zi Yan
  2024-02-09 20:43       ` Vlastimil Babka
  0 siblings, 1 reply; 21+ messages in thread
From: Zi Yan @ 2024-02-09 19:25 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton,
	"Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei",
	Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang,
	Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain,
	Adam Manzanares, "Vishal Moola (Oracle)"

On 9 Feb 2024, at 9:32, Vlastimil Babka wrote:

> On 2/2/24 17:15, Zi Yan wrote:
>> From: Zi Yan <ziy@nvidia.com>
>>
>> migrate_pages() supports >0 order folio migration, and during compaction,
>> even if compaction_alloc() cannot provide >0 order free pages,
>> migrate_pages() can split the source page and try to migrate the base
>> pages from the split. It can serve as a baseline and starting point for
>> adding support for compacting >0 order folios.
>>
>> Suggested-by: Huang Ying <ying.huang@intel.com>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
>>  1 file changed, 35 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 4add68d40e8d..e43e898d2c77 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
>>          return too_many;
>>  }
>>
>> +/*
>> + * 1. if the page order is larger than or equal to target_order (i.e.,
>> + * cc->order and when it is not -1 for global compaction), skip it since
>> + * target_order already indicates no free page with larger than target_order
>> + * exists and later migrating it will most likely fail;
>> + *
>> + * 2. compacting > pageblock_order pages does not improve memory fragmentation,
>> + * skip them;
>> + */
>> +static bool skip_isolation_on_order(int order, int target_order)
>> +{
>> +        return (target_order != -1 && order >= target_order) ||
>> +                order >= pageblock_order;
>> +}
>> +
>>  /**
>>   * isolate_migratepages_block() - isolate all migrate-able pages within
>>   *                                a single pageblock
>> @@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>          /*
>>           * Regardless of being on LRU, compound pages such as THP and
>>           * hugetlbfs are not to be compacted unless we are attempting
>> -         * an allocation much larger than the huge page size (eg CMA).
>> +         * an allocation larger than the compound page size.
>>           * We can potentially save a lot of iterations if we skip them
>>           * at once. The check is racy, but we can consider only valid
>>           * values and the only danger is skipping too much.
>> @@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>          if (PageCompound(page) && !cc->alloc_contig) {
>>                  const unsigned int order = compound_order(page);
>>
>> -                if (likely(order <= MAX_PAGE_ORDER)) {
>> -                        low_pfn += (1UL << order) - 1;
>> -                        nr_scanned += (1UL << order) - 1;
>> +                /*
>> +                 * Skip based on page order and compaction target order
>> +                 * and skip hugetlbfs pages.
>> +                 */
>> +                if (skip_isolation_on_order(order, cc->order) ||
>> +                    PageHuge(page)) {
>
> Hm I'd try to avoid a new PageHuge() test here.
>
> Earlier we have a block that does
> 	if (PageHuge(page) && cc->alloc_contig) {
> 	...
>
> I think I'd rather rewrite it to handle the PageHuge() case completely and
> just make it skip the 1UL << order pages there for !cc->alloc_contig. Even
> if it means duplicating a bit of the low_pfn and nr_scanned bumping code.
>
> Which reminds me the PageHuge() check there is probably still broken ATM:
>
> https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/
>
> Even better reason not to add another one.
> If the huge page materialized since the first check, we should bail out when
> testing PageLRU later anyway.

OK, so basically something like:

if (PageHuge(page)) {
        if (cc->alloc_contig) {
                // existing code for PageHuge(page) && cc->alloc_contig
        } else {
                const unsigned int order = compound_order(page);

                if (order <= MAX_PAGE_ORDER) {
                        low_pfn += (1UL << order) - 1;
                        nr_scanned += (1UL << order) - 1;
                }
                goto isolate_fail;
        }
}

...

if (PageCompound(page) && !cc->alloc_contig) {
        const unsigned int order = compound_order(page);

        /* Skip based on page order and compaction target order. */
        if (skip_isolation_on_order(order, cc->order)) {
                if (order <= MAX_PAGE_ORDER) {
                        low_pfn += (1UL << order) - 1;
                        nr_scanned += (1UL << order) - 1;
                }
                goto isolate_fail;
        }
}

>
>> +                        if (order <= MAX_PAGE_ORDER) {
>> +                                low_pfn += (1UL << order) - 1;
>> +                                nr_scanned += (1UL << order) - 1;
>> +                        }
>> +                        goto isolate_fail;
>>                  }
>> -                goto isolate_fail;
>>          }
>>
>>          /*
>> @@ -1165,10 +1187,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>          }
>>
>>          /*
>> -         * folio become large since the non-locked check,
>> -         * and it's on LRU.
>> +         * Check LRU folio order under the lock
>>           */
>> -        if (unlikely(folio_test_large(folio) && !cc->alloc_contig)) {
>> +        if (unlikely(skip_isolation_on_order(folio_order(folio),
>> +                                             cc->order) &&
>> +                     !cc->alloc_contig)) {
>>                  low_pfn += folio_nr_pages(folio) - 1;
>>                  nr_scanned += folio_nr_pages(folio) - 1;
>>                  folio_set_lru(folio);
>> @@ -1786,6 +1809,10 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>          struct compact_control *cc = (struct compact_control *)data;
>>          struct folio *dst;
>>
>> +        /* this makes migrate_pages() split the source page and retry */
>> +        if (folio_test_large(src) > 0)
>> +                return NULL;
>> +
>>          if (list_empty(&cc->freepages)) {
>>                  isolate_freepages(cc);
>>

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios.
  2024-02-09 19:25     ` Zi Yan
@ 2024-02-09 20:43       ` Vlastimil Babka
  2024-02-09 20:44         ` Zi Yan
  0 siblings, 1 reply; 21+ messages in thread
From: Vlastimil Babka @ 2024-02-09 20:43 UTC (permalink / raw)
  To: Zi Yan
  Cc: linux-mm, linux-kernel, Huang, Ying, Ryan Roberts, Andrew Morton,
	Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao,
	Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi,
	Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares,
	Vishal Moola (Oracle)

On 2/9/24 20:25, Zi Yan wrote:
> On 9 Feb 2024, at 9:32, Vlastimil Babka wrote:
>
>> On 2/2/24 17:15, Zi Yan wrote:
>>> From: Zi Yan <ziy@nvidia.com>
>>>
>>> migrate_pages() supports >0 order folio migration, and during compaction,
>>> even if compaction_alloc() cannot provide >0 order free pages,
>>> migrate_pages() can split the source page and try to migrate the base
>>> pages from the split. It can serve as a baseline and starting point for
>>> adding support for compacting >0 order folios.
>>>
>>> Suggested-by: Huang Ying <ying.huang@intel.com>
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>  mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
>>>  1 file changed, 35 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/mm/compaction.c b/mm/compaction.c
>>> index 4add68d40e8d..e43e898d2c77 100644
>>> --- a/mm/compaction.c
>>> +++ b/mm/compaction.c
>>> @@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
>>>          return too_many;
>>>  }
>>>
>>> +/*
>>> + * 1. if the page order is larger than or equal to target_order (i.e.,
>>> + * cc->order and when it is not -1 for global compaction), skip it since
>>> + * target_order already indicates no free page with larger than target_order
>>> + * exists and later migrating it will most likely fail;
>>> + *
>>> + * 2. compacting > pageblock_order pages does not improve memory fragmentation,
>>> + * skip them;
>>> + */
>>> +static bool skip_isolation_on_order(int order, int target_order)
>>> +{
>>> +        return (target_order != -1 && order >= target_order) ||
>>> +                order >= pageblock_order;
>>> +}
>>> +
>>>  /**
>>>   * isolate_migratepages_block() - isolate all migrate-able pages within
>>>   *                                a single pageblock
>>> @@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>          /*
>>>           * Regardless of being on LRU, compound pages such as THP and
>>>           * hugetlbfs are not to be compacted unless we are attempting
>>> -         * an allocation much larger than the huge page size (eg CMA).
>>> +         * an allocation larger than the compound page size.
>>>           * We can potentially save a lot of iterations if we skip them
>>>           * at once. The check is racy, but we can consider only valid
>>>           * values and the only danger is skipping too much.
>>> @@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>>>          if (PageCompound(page) && !cc->alloc_contig) {
>>>                  const unsigned int order = compound_order(page);
>>>
>>> -                if (likely(order <= MAX_PAGE_ORDER)) {
>>> -                        low_pfn += (1UL << order) - 1;
>>> -                        nr_scanned += (1UL << order) - 1;
>>> +                /*
>>> +                 * Skip based on page order and compaction target order
>>> +                 * and skip hugetlbfs pages.
>>> +                 */
>>> +                if (skip_isolation_on_order(order, cc->order) ||
>>> +                    PageHuge(page)) {
>>
>> Hm I'd try to avoid a new PageHuge() test here.
>>
>> Earlier we have a block that does
>> 	if (PageHuge(page) && cc->alloc_contig) {
>> 	...
>>
>> I think I'd rather rewrite it to handle the PageHuge() case completely and
>> just make it skip the 1UL << order pages there for !cc->alloc_contig. Even
>> if it means duplicating a bit of the low_pfn and nr_scanned bumping code.
>>
>> Which reminds me the PageHuge() check there is probably still broken ATM:
>>
>> https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/
>>
>> Even better reason not to add another one.
>> If the huge page materialized since the first check, we should bail out when
>> testing PageLRU later anyway.
>
>
> OK, so basically something like:
>
> if (PageHuge(page)) {
>         if (cc->alloc_contig) {

Yeah, but I'd handle the !cc->alloc_contig case first, as that ends with a
goto, and then the rest doesn't need to be "} else { ... }" with extra
indentation.

>                 // existing code for PageHuge(page) && cc->alloc_contig
>         } else {
>                 const unsigned int order = compound_order(page);
>
>                 if (order <= MAX_PAGE_ORDER) {
>                         low_pfn += (1UL << order) - 1;
>                         nr_scanned += (1UL << order) - 1;
>                 }
>                 goto isolate_fail;
>         }
> }

^ permalink raw reply	[flat|nested] 21+ messages in thread
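The control flow agreed on above can be modeled standalone. A toy userspace
sketch (not kernel code; the struct fields, constants, and helper names are
all illustrative stand-ins for PageHuge()/PageCompound(), cc->alloc_contig,
and the pfn bookkeeping):

```c
#include <assert.h>
#include <stdbool.h>

struct page { bool huge; bool compound; int order; };
struct compact_control { bool alloc_contig; int order; };

#define MAX_PAGE_ORDER 10  /* illustrative */
#define PAGEBLOCK_ORDER 9  /* illustrative: x86-64, 4KB pages / 2MB blocks */

static bool skip_isolation_on_order(int order, int target_order)
{
        return (target_order != -1 && order >= target_order) ||
                order >= PAGEBLOCK_ORDER;
}

/*
 * Models the suggested structure: hugetlb handled first, bailing out for
 * !alloc_contig, so the later compound-page path needs no PageHuge() test.
 * Returns true if the page would be isolated; *skipped gets the number of
 * extra pfns the scan advances when the page is skipped.
 */
static bool try_isolate(const struct page *page,
                        const struct compact_control *cc,
                        unsigned long *skipped)
{
        *skipped = 0;

        if (page->huge) {
                if (!cc->alloc_contig) {
                        if (page->order <= MAX_PAGE_ORDER)
                                *skipped = (1UL << page->order) - 1;
                        return false; /* goto isolate_fail */
                }
                /* alloc_contig hugetlb handling elided */
        }

        if (page->compound && !cc->alloc_contig &&
            skip_isolation_on_order(page->order, cc->order)) {
                if (page->order <= MAX_PAGE_ORDER)
                        *skipped = (1UL << page->order) - 1;
                return false;
        }

        return true;
}
```

With a compaction target of order 4, the model skips an order-9 hugetlb page
(advancing 511 pfns) and an order-4 THP (advancing 15 pfns), while an order-2
folio is still isolated, matching the behavior the restructuring preserves.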
* Re: [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios. 2024-02-09 20:43 ` Vlastimil Babka @ 2024-02-09 20:44 ` Zi Yan 0 siblings, 0 replies; 21+ messages in thread From: Zi Yan @ 2024-02-09 20:44 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 4118 bytes --] On 9 Feb 2024, at 15:43, Vlastimil Babka wrote: > On 2/9/24 20:25, Zi Yan wrote: >> On 9 Feb 2024, at 9:32, Vlastimil Babka wrote: >> >>> On 2/2/24 17:15, Zi Yan wrote: >>>> From: Zi Yan <ziy@nvidia.com> >>>> >>>> migrate_pages() supports >0 order folio migration and during compaction, >>>> even if compaction_alloc() cannot provide >0 order free pages, >>>> migrate_pages() can split the source page and try to migrate the base pages >>>> from the split. It can be a baseline and start point for adding support for >>>> compacting >0 order folios. >>>> >>>> Suggested-by: Huang Ying <ying.huang@intel.com> >>>> Signed-off-by: Zi Yan <ziy@nvidia.com> >>>> --- >>>> mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++-------- >>>> 1 file changed, 35 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/mm/compaction.c b/mm/compaction.c >>>> index 4add68d40e8d..e43e898d2c77 100644 >>>> --- a/mm/compaction.c >>>> +++ b/mm/compaction.c >>>> @@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc) >>>> return too_many; >>>> } >>>> >>>> +/* >>>> + * 1. if the page order is larger than or equal to target_order (i.e., >>>> + * cc->order and when it is not -1 for global compaction), skip it since >>>> + * target_order already indicates no free page with larger than target_order >>>> + * exists and later migrating it will most likely fail; >>>> + * >>>> + * 2. 
compacting > pageblock_order pages does not improve memory fragmentation, >>>> + * skip them; >>>> + */ >>>> +static bool skip_isolation_on_order(int order, int target_order) >>>> +{ >>>> + return (target_order != -1 && order >= target_order) || >>>> + order >= pageblock_order; >>>> +} >>>> + >>>> /** >>>> * isolate_migratepages_block() - isolate all migrate-able pages within >>>> * a single pageblock >>>> @@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, >>>> /* >>>> * Regardless of being on LRU, compound pages such as THP and >>>> * hugetlbfs are not to be compacted unless we are attempting >>>> - * an allocation much larger than the huge page size (eg CMA). >>>> + * an allocation larger than the compound page size. >>>> * We can potentially save a lot of iterations if we skip them >>>> * at once. The check is racy, but we can consider only valid >>>> * values and the only danger is skipping too much. >>>> @@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn, >>>> if (PageCompound(page) && !cc->alloc_contig) { >>>> const unsigned int order = compound_order(page); >>>> >>>> - if (likely(order <= MAX_PAGE_ORDER)) { >>>> - low_pfn += (1UL << order) - 1; >>>> - nr_scanned += (1UL << order) - 1; >>>> + /* >>>> + * Skip based on page order and compaction target order >>>> + * and skip hugetlbfs pages. >>>> + */ >>>> + if (skip_isolation_on_order(order, cc->order) || >>>> + PageHuge(page)) { >>> >>> Hm I'd try to avoid a new PageHuge() test here. >>> >>> Earlier we have a block that does >>> if (PageHuge(page) && cc->alloc_contig) { >>> ... >>> >>> think I'd rather rewrite it to handle the PageHuge() case completely and >>> just make it skip the 1UL << order pages there for !cc->alloc_config. Even >>> if it means duplicating a bit of the low_pfn and nr_scanned bumping code. 
>>>
>>> Which reminds me the PageHuge() check there is probably still broken ATM:
>>>
>>> https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/
>>>
>>> Even better reason not to add another one.
>>> If the huge page materialized since the first check, we should bail out when
>>> testing PageLRU later anyway.
>>
>>
>> OK, so basically something like:
>>
>> if (PageHuge(page)) {
>>         if (cc->alloc_contig) {
>
> Yeah, but I'd handle the !cc->alloc_contig case first, as that ends with a
> goto, and then the rest doesn't need to be "} else { ... }" with extra
> indentation.

OK. No problem.

--
Best Regards,
Yan, Zi

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 854 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread
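[Editor's note] The skip check added in patch 1, skip_isolation_on_order(), is a pure predicate, so its intended behavior is easy to sanity-check in userspace. Below is an illustrative copy, assuming pageblock_order == 9 (the usual value for x86-64 with 4 KiB pages); the real function lives in mm/compaction.c.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGEBLOCK_ORDER 9	/* assumption: typical x86-64/4K value */

/*
 * Userspace copy of patch 1's skip_isolation_on_order(): skip a compound
 * page if it is at least as large as the compaction target order (no free
 * page that big is expected to exist, so migration would likely fail), or
 * if it already spans a whole pageblock (migrating it does not reduce
 * fragmentation). target_order == -1 means global compaction.
 */
static bool skip_isolation_on_order(int order, int target_order)
{
	return (target_order != -1 && order >= target_order) ||
	       order >= PAGEBLOCK_ORDER;
}
```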
* [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan 2024-02-02 16:15 ` [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios Zi Yan @ 2024-02-02 16:15 ` Zi Yan 2024-02-09 16:37 ` Vlastimil Babka 2024-02-02 16:15 ` [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan ` (2 subsequent siblings) 4 siblings, 1 reply; 21+ messages in thread From: Zi Yan @ 2024-02-02 16:15 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: Zi Yan, Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) From: Zi Yan <ziy@nvidia.com> Before last commit, memory compaction only migrates order-0 folios and skips >0 order folios. Last commit splits all >0 order folios during compaction. This commit migrates >0 order folios during compaction by keeping isolated free pages at their original size without splitting them into order-0 pages and using them directly during migration process. What is different from the prior implementation: 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page lists, where each page list stores free pages in the same order. 2. All free pages are not post_alloc_hook() processed nor buddy pages, although their orders are stored in first page's private like buddy pages. 3. During migration, in new page allocation time (i.e., in compaction_alloc()), free pages are then processed by post_alloc_hook(). When migration fails and a new page is returned (i.e., in compaction_free()), free pages are restored by reversing the post_alloc_hook() operations using newly added free_pages_prepare_fpi_none(). 
Step 3 is done for a later optimization, so that splitting and/or merging free pages during compaction becomes easier. Note: without splitting free pages, compaction can end prematurely because migration will return -ENOMEM even if there are free pages. This happens when no order-0 free page exists and compaction_alloc() returns NULL. Signed-off-by: Zi Yan <ziy@nvidia.com> --- mm/compaction.c | 149 +++++++++++++++++++++++++++++------------------- mm/internal.h | 9 ++- mm/page_alloc.c | 6 ++ 3 files changed, 104 insertions(+), 60 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index e43e898d2c77..58a4e3fb72ec 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -66,45 +66,67 @@ static inline void count_compact_events(enum vm_event_item item, long delta) #define COMPACTION_HPAGE_ORDER (PMD_SHIFT - PAGE_SHIFT) #endif -static unsigned long release_freepages(struct list_head *freelist) +static void init_page_list(struct page_list *p) { - struct page *page, *next; - unsigned long high_pfn = 0; - - list_for_each_entry_safe(page, next, freelist, lru) { - unsigned long pfn = page_to_pfn(page); - list_del(&page->lru); - __free_page(page); - if (pfn > high_pfn) - high_pfn = pfn; - } - - return high_pfn; + INIT_LIST_HEAD(&p->pages); + p->nr_pages = 0; } -static void split_map_pages(struct list_head *list) +static void split_map_pages(struct page_list *freepages) { - unsigned int i, order, nr_pages; + unsigned int i, order, total_nr_pages; struct page *page, *next; LIST_HEAD(tmp_list); - list_for_each_entry_safe(page, next, list, lru) { - list_del(&page->lru); + for (order = 0; order < NR_PAGE_ORDERS; order++) { + total_nr_pages = freepages[order].nr_pages * (1 << order); + freepages[order].nr_pages = 0; + + list_for_each_entry_safe(page, next, &freepages[order].pages, lru) { + unsigned int nr_pages; + + list_del(&page->lru); - order = page_private(page); - nr_pages = 1 << order; + nr_pages = 1 << order; - post_alloc_hook(page, order, __GFP_MOVABLE); - if (order) -
split_page(page, order); + post_alloc_hook(page, order, __GFP_MOVABLE); + if (order) + split_page(page, order); - for (i = 0; i < nr_pages; i++) { - list_add(&page->lru, &tmp_list); - page++; + for (i = 0; i < nr_pages; i++) { + list_add(&page->lru, &tmp_list); + page++; + } } + freepages[0].nr_pages += total_nr_pages; + list_splice_init(&tmp_list, &freepages[0].pages); } +} - list_splice(&tmp_list, list); +static unsigned long release_free_list(struct page_list *freepages) +{ + int order; + unsigned long high_pfn = 0; + + for (order = 0; order < NR_PAGE_ORDERS; order++) { + struct page *page, *next; + + list_for_each_entry_safe(page, next, &freepages[order].pages, lru) { + unsigned long pfn = page_to_pfn(page); + + list_del(&page->lru); + /* + * Convert free pages into post allocation pages, so + * that we can free them via __free_page. + */ + post_alloc_hook(page, order, __GFP_MOVABLE); + __free_pages(page, order); + if (pfn > high_pfn) + high_pfn = pfn; + } + freepages[order].nr_pages = 0; + } + return high_pfn; } #ifdef CONFIG_COMPACTION @@ -583,7 +605,7 @@ static bool compact_unlock_should_abort(spinlock_t *lock, static unsigned long isolate_freepages_block(struct compact_control *cc, unsigned long *start_pfn, unsigned long end_pfn, - struct list_head *freelist, + struct page_list *freelist, unsigned int stride, bool strict) { @@ -657,7 +679,8 @@ static unsigned long isolate_freepages_block(struct compact_control *cc, nr_scanned += isolated - 1; total_isolated += isolated; cc->nr_freepages += isolated; - list_add_tail(&page->lru, freelist); + list_add_tail(&page->lru, &freelist[order].pages); + freelist[order].nr_pages++; if (!strict && cc->nr_migratepages <= cc->nr_freepages) { blockpfn += isolated; @@ -722,7 +745,11 @@ isolate_freepages_range(struct compact_control *cc, unsigned long start_pfn, unsigned long end_pfn) { unsigned long isolated, pfn, block_start_pfn, block_end_pfn; - LIST_HEAD(freelist); + int order; + struct page_list 
tmp_freepages[NR_PAGE_ORDERS]; + + for (order = 0; order < NR_PAGE_ORDERS; order++) + init_page_list(&tmp_freepages[order]); pfn = start_pfn; block_start_pfn = pageblock_start_pfn(pfn); @@ -753,7 +780,7 @@ isolate_freepages_range(struct compact_control *cc, break; isolated = isolate_freepages_block(cc, &isolate_start_pfn, - block_end_pfn, &freelist, 0, true); + block_end_pfn, tmp_freepages, 0, true); /* * In strict mode, isolate_freepages_block() returns 0 if @@ -770,15 +797,15 @@ isolate_freepages_range(struct compact_control *cc, */ } - /* __isolate_free_page() does not map the pages */ - split_map_pages(&freelist); - if (pfn < end_pfn) { /* Loop terminated early, cleanup. */ - release_freepages(&freelist); + release_free_list(tmp_freepages); return 0; } + /* __isolate_free_page() does not map the pages */ + split_map_pages(tmp_freepages); + /* We don't use freelists for anything. */ return pfn; } @@ -1481,7 +1508,7 @@ fast_isolate_around(struct compact_control *cc, unsigned long pfn) if (!page) return; - isolate_freepages_block(cc, &start_pfn, end_pfn, &cc->freepages, 1, false); + isolate_freepages_block(cc, &start_pfn, end_pfn, cc->freepages, 1, false); /* Skip this pageblock in the future as it's full or nearly full */ if (start_pfn == end_pfn && !cc->no_set_skip_hint) @@ -1610,7 +1637,8 @@ static void fast_isolate_freepages(struct compact_control *cc) nr_scanned += nr_isolated - 1; total_isolated += nr_isolated; cc->nr_freepages += nr_isolated; - list_add_tail(&page->lru, &cc->freepages); + list_add_tail(&page->lru, &cc->freepages[order].pages); + cc->freepages[order].nr_pages++; count_compact_events(COMPACTISOLATED, nr_isolated); } else { /* If isolation fails, abort the search */ @@ -1687,13 +1715,12 @@ static void isolate_freepages(struct compact_control *cc) unsigned long isolate_start_pfn; /* exact pfn we start at */ unsigned long block_end_pfn; /* end of current pageblock */ unsigned long low_pfn; /* lowest pfn scanner is able to scan */ - struct 
list_head *freelist = &cc->freepages; unsigned int stride; /* Try a small search of the free lists for a candidate */ fast_isolate_freepages(cc); if (cc->nr_freepages) - goto splitmap; + return; /* * Initialise the free scanner. The starting point is where we last @@ -1753,7 +1780,7 @@ static void isolate_freepages(struct compact_control *cc) /* Found a block suitable for isolating free pages from. */ nr_isolated = isolate_freepages_block(cc, &isolate_start_pfn, - block_end_pfn, freelist, stride, false); + block_end_pfn, cc->freepages, stride, false); /* Update the skip hint if the full pageblock was scanned */ if (isolate_start_pfn == block_end_pfn) @@ -1794,10 +1821,6 @@ static void isolate_freepages(struct compact_control *cc) * and the loop terminated due to isolate_start_pfn < low_pfn */ cc->free_pfn = isolate_start_pfn; - -splitmap: - /* __isolate_free_page() does not map the pages */ - split_map_pages(freelist); } /* @@ -1808,23 +1831,22 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) { struct compact_control *cc = (struct compact_control *)data; struct folio *dst; + int order = folio_order(src); - /* this makes migrate_pages() split the source page and retry */ - if (folio_test_large(src) > 0) - return NULL; - - if (list_empty(&cc->freepages)) { + if (!cc->freepages[order].nr_pages) { isolate_freepages(cc); - - if (list_empty(&cc->freepages)) + if (!cc->freepages[order].nr_pages) return NULL; } - dst = list_entry(cc->freepages.next, struct folio, lru); + dst = list_first_entry(&cc->freepages[order].pages, struct folio, lru); + cc->freepages[order].nr_pages--; list_del(&dst->lru); - cc->nr_freepages--; - - return dst; + post_alloc_hook(&dst->page, order, __GFP_MOVABLE); + if (order) + prep_compound_page(&dst->page, order); + cc->nr_freepages -= 1 << order; + return page_rmappable_folio(&dst->page); } /* @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) static void 
compaction_free(struct folio *dst, unsigned long data) { struct compact_control *cc = (struct compact_control *)data; + int order = folio_order(dst); + struct page *page = &dst->page; + + folio_set_count(dst, 0); + free_pages_prepare_fpi_none(page, order); - list_add(&dst->lru, &cc->freepages); - cc->nr_freepages++; + INIT_LIST_HEAD(&dst->lru); + + list_add(&dst->lru, &cc->freepages[order].pages); + cc->freepages[order].nr_pages++; + cc->nr_freepages += 1 << order; } /* possible outcome of isolate_migratepages */ @@ -2461,6 +2491,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc) const bool sync = cc->mode != MIGRATE_ASYNC; bool update_cached; unsigned int nr_succeeded = 0; + int order; /* * These counters track activities during zone compaction. Initialize @@ -2470,7 +2501,8 @@ compact_zone(struct compact_control *cc, struct capture_control *capc) cc->total_free_scanned = 0; cc->nr_migratepages = 0; cc->nr_freepages = 0; - INIT_LIST_HEAD(&cc->freepages); + for (order = 0; order < NR_PAGE_ORDERS; order++) + init_page_list(&cc->freepages[order]); INIT_LIST_HEAD(&cc->migratepages); cc->migratetype = gfp_migratetype(cc->gfp_mask); @@ -2656,7 +2688,7 @@ compact_zone(struct compact_control *cc, struct capture_control *capc) * so we don't leave any returned pages behind in the next attempt. 
*/ if (cc->nr_freepages > 0) { - unsigned long free_pfn = release_freepages(&cc->freepages); + unsigned long free_pfn = release_free_list(cc->freepages); cc->nr_freepages = 0; VM_BUG_ON(free_pfn == 0); @@ -2675,7 +2707,6 @@ compact_zone(struct compact_control *cc, struct capture_control *capc) trace_mm_compaction_end(cc, start_pfn, end_pfn, sync, ret); - VM_BUG_ON(!list_empty(&cc->freepages)); VM_BUG_ON(!list_empty(&cc->migratepages)); return ret; diff --git a/mm/internal.h b/mm/internal.h index 1e29c5821a1d..c6ea449c5353 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -447,6 +447,8 @@ extern void prep_compound_page(struct page *page, unsigned int order); extern void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags); +extern bool free_pages_prepare_fpi_none(struct page *page, unsigned int order); + extern int user_min_free_kbytes; extern void free_unref_page(struct page *page, unsigned int order); @@ -473,6 +475,11 @@ int split_free_page(struct page *free_page, /* * in mm/compaction.c */ + +struct page_list { + struct list_head pages; + unsigned long nr_pages; +}; /* * compact_control is used to track pages being migrated and the free pages * they are being migrated to during memory compaction. 
The free_pfn starts @@ -481,7 +488,7 @@ int split_free_page(struct page *free_page, * completes when free_pfn <= migrate_pfn */ struct compact_control { - struct list_head freepages; /* List of free pages to migrate to */ + struct page_list freepages[NR_PAGE_ORDERS]; /* List of free pages to migrate to */ struct list_head migratepages; /* List of pages being migrated */ unsigned int nr_freepages; /* Number of isolated free pages */ unsigned int nr_migratepages; /* Number of pages to migrate */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 5be4cd8f6b5a..c7c135e6d5ee 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1179,6 +1179,12 @@ static __always_inline bool free_pages_prepare(struct page *page, return true; } +__always_inline bool free_pages_prepare_fpi_none(struct page *page, + unsigned int order) +{ + return free_pages_prepare(page, order, FPI_NONE); +} + /* * Frees a number of pages from the PCP lists * Assumes all pages on list are in same zone. -- 2.43.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
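[Editor's note] The bookkeeping invariant of the patch above is that compaction_alloc() and compaction_free() now work against a per-order array while cc->nr_freepages keeps counting base (order-0) pages, so every take/return of an order-n page moves the total by 1 << n. A counts-only userspace sketch of that invariant (model names are hypothetical; real code manipulates struct page lists and post_alloc_hook()):

```c
#include <assert.h>

#define NR_PAGE_ORDERS 11	/* assumption: MAX_PAGE_ORDER (10) + 1 */

struct compact_control_model {
	unsigned long nr_per_order[NR_PAGE_ORDERS]; /* freepages[order].nr_pages */
	unsigned int nr_freepages;                  /* total base pages isolated */
};

/* Take one free page of `order`, as compaction_alloc() would. */
static int model_alloc(struct compact_control_model *cc, int order)
{
	if (!cc->nr_per_order[order])
		return -1;	/* the real code retries after isolate_freepages() */
	cc->nr_per_order[order]--;
	cc->nr_freepages -= 1UL << order;
	return 0;
}

/* Return a page of `order` on migration failure, as compaction_free() would. */
static void model_free(struct compact_control_model *cc, int order)
{
	cc->nr_per_order[order]++;
	cc->nr_freepages += 1UL << order;
}
```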
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-02 16:15 ` [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction Zi Yan @ 2024-02-09 16:37 ` Vlastimil Babka 2024-02-09 19:36 ` Zi Yan 2024-02-09 21:58 ` Zi Yan 0 siblings, 2 replies; 21+ messages in thread From: Vlastimil Babka @ 2024-02-09 16:37 UTC (permalink / raw) To: Zi Yan, linux-mm, linux-kernel Cc: Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) On 2/2/24 17:15, Zi Yan wrote: > From: Zi Yan <ziy@nvidia.com> > > Before last commit, memory compaction only migrates order-0 folios and > skips >0 order folios. Last commit splits all >0 order folios during > compaction. This commit migrates >0 order folios during compaction by > keeping isolated free pages at their original size without splitting them > into order-0 pages and using them directly during migration process. > > What is different from the prior implementation: > 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page > lists, where each page list stores free pages in the same order. > 2. All free pages are not post_alloc_hook() processed nor buddy pages, > although their orders are stored in first page's private like buddy > pages. > 3. During migration, in new page allocation time (i.e., in > compaction_alloc()), free pages are then processed by post_alloc_hook(). > When migration fails and a new page is returned (i.e., in > compaction_free()), free pages are restored by reversing the > post_alloc_hook() operations using newly added > free_pages_prepare_fpi_none(). > > Step 3 is done for a latter optimization that splitting and/or merging free > pages during compaction becomes easier. 
> > Note: without splitting free pages, compaction can end prematurely due to > migration will return -ENOMEM even if there is free pages. This happens > when no order-0 free page exist and compaction_alloc() return NULL. > > Signed-off-by: Zi Yan <ziy@nvidia.com> ... > /* > @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) > static void compaction_free(struct folio *dst, unsigned long data) > { > struct compact_control *cc = (struct compact_control *)data; > + int order = folio_order(dst); > + struct page *page = &dst->page; > + > + folio_set_count(dst, 0); We can't change refcount to 0 like this, after it was already set to 1 and somebody else might have done get_page_unless_zero(). You need to either put_page_testzero() and if it's false, consider the page lost, or leave it refcounted and adjust the code to handle both refcounted and non-refcounted pages on the lists (the first option is simpler and shouldn't be too bad). Perhaps folio_set_count()/set_page_count() should get some comment warning against this kind of mistake. > + free_pages_prepare_fpi_none(page, order); > > - list_add(&dst->lru, &cc->freepages); > - cc->nr_freepages++; > + INIT_LIST_HEAD(&dst->lru); > + > + list_add(&dst->lru, &cc->freepages[order].pages); > + cc->freepages[order].nr_pages++; > + cc->nr_freepages += 1 << order; > } > ... > > extern void free_unref_page(struct page *page, unsigned int order); > @@ -473,6 +475,11 @@ int split_free_page(struct page *free_page, > /* > * in mm/compaction.c > */ > + > +struct page_list { > + struct list_head pages; > + unsigned long nr_pages; I've checked and even with patch 3/3 I don't think you actually need the counter? The only check of the counter I noticed was to check for zero/non-zero, and you could use list_empty() instead. > +}; > /* > * compact_control is used to track pages being migrated and the free pages > * they are being migrated to during memory compaction. 
The free_pfn starts > @@ -481,7 +488,7 @@ int split_free_page(struct page *free_page, > * completes when free_pfn <= migrate_pfn > */ > struct compact_control { > - struct list_head freepages; /* List of free pages to migrate to */ > + struct page_list freepages[NR_PAGE_ORDERS]; /* List of free pages to migrate to */ > struct list_head migratepages; /* List of pages being migrated */ > unsigned int nr_freepages; /* Number of isolated free pages */ > unsigned int nr_migratepages; /* Number of pages to migrate */ > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 5be4cd8f6b5a..c7c135e6d5ee 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1179,6 +1179,12 @@ static __always_inline bool free_pages_prepare(struct page *page, > return true; > } > > +__always_inline bool free_pages_prepare_fpi_none(struct page *page, > + unsigned int order) > +{ > + return free_pages_prepare(page, order, FPI_NONE); > +} > + > /* > * Frees a number of pages from the PCP lists > * Assumes all pages on list are in same zone. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-09 16:37 ` Vlastimil Babka @ 2024-02-09 19:36 ` Zi Yan 2024-02-09 19:40 ` Zi Yan 2024-02-09 21:58 ` Zi Yan 1 sibling, 1 reply; 21+ messages in thread From: Zi Yan @ 2024-02-09 19:36 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 4717 bytes --] On 9 Feb 2024, at 11:37, Vlastimil Babka wrote: > On 2/2/24 17:15, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> Before last commit, memory compaction only migrates order-0 folios and >> skips >0 order folios. Last commit splits all >0 order folios during >> compaction. This commit migrates >0 order folios during compaction by >> keeping isolated free pages at their original size without splitting them >> into order-0 pages and using them directly during migration process. >> >> What is different from the prior implementation: >> 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page >> lists, where each page list stores free pages in the same order. >> 2. All free pages are not post_alloc_hook() processed nor buddy pages, >> although their orders are stored in first page's private like buddy >> pages. >> 3. During migration, in new page allocation time (i.e., in >> compaction_alloc()), free pages are then processed by post_alloc_hook(). >> When migration fails and a new page is returned (i.e., in >> compaction_free()), free pages are restored by reversing the >> post_alloc_hook() operations using newly added >> free_pages_prepare_fpi_none(). >> >> Step 3 is done for a latter optimization that splitting and/or merging free >> pages during compaction becomes easier. 
>> >> Note: without splitting free pages, compaction can end prematurely due to >> migration will return -ENOMEM even if there is free pages. This happens >> when no order-0 free page exist and compaction_alloc() return NULL. >> >> Signed-off-by: Zi Yan <ziy@nvidia.com> > > ... > >> /* >> @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >> static void compaction_free(struct folio *dst, unsigned long data) >> { >> struct compact_control *cc = (struct compact_control *)data; >> + int order = folio_order(dst); >> + struct page *page = &dst->page; >> + >> + folio_set_count(dst, 0); > > We can't change refcount to 0 like this, after it was already set to 1 and > somebody else might have done get_page_unless_zero(). You need to either > put_page_testzero() and if it's false, consider the page lost, or leave it > refcounted and adjust the code to handle both refcounted and non-refcounted > pages on the lists (the first option is simpler and shouldn't be too bad). Got it. Will fix it with the first option. Thanks. > > Perhaps folio_set_count()/set_page_count() should get some comment warning > against this kind of mistake. > >> + free_pages_prepare_fpi_none(page, order); >> >> - list_add(&dst->lru, &cc->freepages); >> - cc->nr_freepages++; >> + INIT_LIST_HEAD(&dst->lru); >> + >> + list_add(&dst->lru, &cc->freepages[order].pages); >> + cc->freepages[order].nr_pages++; >> + cc->nr_freepages += 1 << order; >> } >> > > ... > >> >> extern void free_unref_page(struct page *page, unsigned int order); >> @@ -473,6 +475,11 @@ int split_free_page(struct page *free_page, >> /* >> * in mm/compaction.c >> */ >> + >> +struct page_list { >> + struct list_head pages; >> + unsigned long nr_pages; > > I've checked and even with patch 3/3 I don't think you actually need the > counter? The only check of the counter I noticed was to check for > zero/non-zero, and you could use list_empty() instead. Sure. I will remove nr_pages. 
> >> +}; >> /* >> * compact_control is used to track pages being migrated and the free pages >> * they are being migrated to during memory compaction. The free_pfn starts >> @@ -481,7 +488,7 @@ int split_free_page(struct page *free_page, >> * completes when free_pfn <= migrate_pfn >> */ >> struct compact_control { >> - struct list_head freepages; /* List of free pages to migrate to */ >> + struct page_list freepages[NR_PAGE_ORDERS]; /* List of free pages to migrate to */ >> struct list_head migratepages; /* List of pages being migrated */ >> unsigned int nr_freepages; /* Number of isolated free pages */ >> unsigned int nr_migratepages; /* Number of pages to migrate */ >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 5be4cd8f6b5a..c7c135e6d5ee 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -1179,6 +1179,12 @@ static __always_inline bool free_pages_prepare(struct page *page, >> return true; >> } >> >> +__always_inline bool free_pages_prepare_fpi_none(struct page *page, >> + unsigned int order) >> +{ >> + return free_pages_prepare(page, order, FPI_NONE); >> +} >> + >> /* >> * Frees a number of pages from the PCP lists >> * Assumes all pages on list are in same zone. -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-09 19:36 ` Zi Yan @ 2024-02-09 19:40 ` Zi Yan 2024-02-09 20:46 ` Vlastimil Babka 0 siblings, 1 reply; 21+ messages in thread From: Zi Yan @ 2024-02-09 19:40 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 2662 bytes --] On 9 Feb 2024, at 14:36, Zi Yan wrote: > On 9 Feb 2024, at 11:37, Vlastimil Babka wrote: > >> On 2/2/24 17:15, Zi Yan wrote: >>> From: Zi Yan <ziy@nvidia.com> >>> >>> Before last commit, memory compaction only migrates order-0 folios and >>> skips >0 order folios. Last commit splits all >0 order folios during >>> compaction. This commit migrates >0 order folios during compaction by >>> keeping isolated free pages at their original size without splitting them >>> into order-0 pages and using them directly during migration process. >>> >>> What is different from the prior implementation: >>> 1. All isolated free pages are kept in a NR_PAGE_ORDERS array of page >>> lists, where each page list stores free pages in the same order. >>> 2. All free pages are not post_alloc_hook() processed nor buddy pages, >>> although their orders are stored in first page's private like buddy >>> pages. >>> 3. During migration, in new page allocation time (i.e., in >>> compaction_alloc()), free pages are then processed by post_alloc_hook(). >>> When migration fails and a new page is returned (i.e., in >>> compaction_free()), free pages are restored by reversing the >>> post_alloc_hook() operations using newly added >>> free_pages_prepare_fpi_none(). 
>>> >>> Step 3 is done for a latter optimization that splitting and/or merging free >>> pages during compaction becomes easier. >>> >>> Note: without splitting free pages, compaction can end prematurely due to >>> migration will return -ENOMEM even if there is free pages. This happens >>> when no order-0 free page exist and compaction_alloc() return NULL. >>> >>> Signed-off-by: Zi Yan <ziy@nvidia.com> >> >> ... >> >>> /* >>> @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >>> static void compaction_free(struct folio *dst, unsigned long data) >>> { >>> struct compact_control *cc = (struct compact_control *)data; >>> + int order = folio_order(dst); >>> + struct page *page = &dst->page; >>> + >>> + folio_set_count(dst, 0); >> >> We can't change refcount to 0 like this, after it was already set to 1 and >> somebody else might have done get_page_unless_zero(). You need to either >> put_page_testzero() and if it's false, consider the page lost, or leave it >> refcounted and adjust the code to handle both refcounted and non-refcounted >> pages on the lists (the first option is simpler and shouldn't be too bad). > Got it. Will fix it with the first option. Thanks. Do you think we should have a WARN or WARN_ONCE if we lose a page here? -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-09 19:40 ` Zi Yan @ 2024-02-09 20:46 ` Vlastimil Babka 2024-02-09 20:47 ` Zi Yan 0 siblings, 1 reply; 21+ messages in thread From: Vlastimil Babka @ 2024-02-09 20:46 UTC (permalink / raw) To: Zi Yan Cc: linux-mm, linux-kernel, Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) On 2/9/24 20:40, Zi Yan wrote: > On 9 Feb 2024, at 14:36, Zi Yan wrote: > >> On 9 Feb 2024, at 11:37, Vlastimil Babka wrote: >> >>> On 2/2/24 17:15, Zi Yan wrote: >>> >>> ... >>> >>>> /* >>>> @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >>>> static void compaction_free(struct folio *dst, unsigned long data) >>>> { >>>> struct compact_control *cc = (struct compact_control *)data; >>>> + int order = folio_order(dst); >>>> + struct page *page = &dst->page; >>>> + >>>> + folio_set_count(dst, 0); >>> >>> We can't change refcount to 0 like this, after it was already set to 1 and >>> somebody else might have done get_page_unless_zero(). You need to either >>> put_page_testzero() and if it's false, consider the page lost, or leave it >>> refcounted and adjust the code to handle both refcounted and non-refcounted >>> pages on the lists (the first option is simpler and shouldn't be too bad). >> Got it. Will fix it with the first option. Thanks. > > Do you think we should have a WARN or WARN_ONCE if we lose a page here? No, no WARN, it all happens legitimately. It's only our compaction losing the page - whoever would do the get_page_unless_zero() to inspect that page would then have to put_page() which will free it back to page allocator. > -- > Best Regards, > Yan, Zi ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-09 20:46 ` Vlastimil Babka @ 2024-02-09 20:47 ` Zi Yan 0 siblings, 0 replies; 21+ messages in thread From: Zi Yan @ 2024-02-09 20:47 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 1495 bytes --] On 9 Feb 2024, at 15:46, Vlastimil Babka wrote: > On 2/9/24 20:40, Zi Yan wrote: >> On 9 Feb 2024, at 14:36, Zi Yan wrote: >> >>> On 9 Feb 2024, at 11:37, Vlastimil Babka wrote: >>> >>>> On 2/2/24 17:15, Zi Yan wrote: >>>> >>>> ... >>>> >>>>> /* >>>>> @@ -1835,9 +1857,17 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >>>>> static void compaction_free(struct folio *dst, unsigned long data) >>>>> { >>>>> struct compact_control *cc = (struct compact_control *)data; >>>>> + int order = folio_order(dst); >>>>> + struct page *page = &dst->page; >>>>> + >>>>> + folio_set_count(dst, 0); >>>> >>>> We can't change refcount to 0 like this, after it was already set to 1 and >>>> somebody else might have done get_page_unless_zero(). You need to either >>>> put_page_testzero() and if it's false, consider the page lost, or leave it >>>> refcounted and adjust the code to handle both refcounted and non-refcounted >>>> pages on the lists (the first option is simpler and shouldn't be too bad). >>> Got it. Will fix it with the first option. Thanks. >> >> Do you think we should have a WARN or WARN_ONCE if we lose a page here? > > No, no WARN, it all happens legitimately. 
It's only our compaction losing > the page - whoever would do the get_page_unless_zero() to inspect that page > would then have to put_page() which will free it back to page allocator. Got it. Thanks for the explanation. -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction. 2024-02-09 16:37 ` Vlastimil Babka 2024-02-09 19:36 ` Zi Yan @ 2024-02-09 21:58 ` Zi Yan 1 sibling, 0 replies; 21+ messages in thread From: Zi Yan @ 2024-02-09 21:58 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 697 bytes --] On 9 Feb 2024, at 11:37, Vlastimil Babka wrote: >> >> extern void free_unref_page(struct page *page, unsigned int order); >> @@ -473,6 +475,11 @@ int split_free_page(struct page *free_page, >> /* >> * in mm/compaction.c >> */ >> + >> +struct page_list { >> + struct list_head pages; >> + unsigned long nr_pages; > > I've checked and even with patch 3/3 I don't think you actually need the > counter? The only check of the counter I noticed was to check for > zero/non-zero, and you could use list_empty() instead. Should I just use struct list_head instead of a new struct page_list? list_empty(&cc->freepages[order]) vs list_empty(&cc->freepages[order].pages). -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split. 2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan 2024-02-02 16:15 ` [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios Zi Yan 2024-02-02 16:15 ` [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction Zi Yan @ 2024-02-02 16:15 ` Zi Yan 2024-02-09 18:43 ` Vlastimil Babka 2024-02-02 19:55 ` [PATCH v3 0/3] Enable >0 order folio memory compaction Luis Chamberlain 2024-02-05 8:16 ` Baolin Wang 4 siblings, 1 reply; 21+ messages in thread From: Zi Yan @ 2024-02-02 16:15 UTC (permalink / raw) To: linux-mm, linux-kernel Cc: Zi Yan, Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) From: Zi Yan <ziy@nvidia.com> During migration in memory compaction, free pages are placed in an array of page lists based on their order. But the desired free page order (i.e., the order of a source page) might not always be present, leading to migration failures and premature compaction termination. Split a higher-order free page when the source migration page has a lower order to increase the migration success rate. Note: merging free pages when a migration fails and a lower-order free page is returned via compaction_free() is possible, but it would take too much work, since the free pages are not buddy pages and it is hard to identify them using the existing PFN-based page merging algorithm.
Signed-off-by: Zi Yan <ziy@nvidia.com> --- mm/compaction.c | 37 ++++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/mm/compaction.c b/mm/compaction.c index 58a4e3fb72ec..fa9993c8a389 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -1832,9 +1832,43 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) struct compact_control *cc = (struct compact_control *)data; struct folio *dst; int order = folio_order(src); + bool has_isolated_pages = false; +again: if (!cc->freepages[order].nr_pages) { - isolate_freepages(cc); + int i; + + for (i = order + 1; i < NR_PAGE_ORDERS; i++) { + if (cc->freepages[i].nr_pages) { + struct page *freepage = + list_first_entry(&cc->freepages[i].pages, + struct page, lru); + + int start_order = i; + unsigned long size = 1 << start_order; + + list_del(&freepage->lru); + cc->freepages[i].nr_pages--; + + while (start_order > order) { + start_order--; + size >>= 1; + + list_add(&freepage[size].lru, + &cc->freepages[start_order].pages); + cc->freepages[start_order].nr_pages++; + set_page_private(&freepage[size], start_order); + } + dst = (struct folio *)freepage; + goto done; + } + } + if (!has_isolated_pages) { + isolate_freepages(cc); + has_isolated_pages = true; + goto again; + } + if (!cc->freepages[order].nr_pages) return NULL; } @@ -1842,6 +1876,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) dst = list_first_entry(&cc->freepages[order].pages, struct folio, lru); cc->freepages[order].nr_pages--; list_del(&dst->lru); +done: post_alloc_hook(&dst->page, order, __GFP_MOVABLE); if (order) prep_compound_page(&dst->page, order); -- 2.43.0 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split. 2024-02-02 16:15 ` [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan @ 2024-02-09 18:43 ` Vlastimil Babka 2024-02-09 19:57 ` Zi Yan 0 siblings, 1 reply; 21+ messages in thread From: Vlastimil Babka @ 2024-02-09 18:43 UTC (permalink / raw) To: Zi Yan, linux-mm, linux-kernel Cc: Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) On 2/2/24 17:15, Zi Yan wrote: > From: Zi Yan <ziy@nvidia.com> > > During migration in a memory compaction, free pages are placed in an array > of page lists based on their order. But the desired free page order (i.e., > the order of a source page) might not be always present, thus leading to > migration failures and premature compaction termination. Split a high > order free pages when source migration page has a lower order to increase > migration successful rate. > > Note: merging free pages when a migration fails and a lower order free > page is returned via compaction_free() is possible, but there is too much > work. Since the free pages are not buddy pages, it is hard to identify > these free pages using existing PFN-based page merging algorithm. 
> > Signed-off-by: Zi Yan <ziy@nvidia.com> > --- > mm/compaction.c | 37 ++++++++++++++++++++++++++++++++++++- > 1 file changed, 36 insertions(+), 1 deletion(-) > > diff --git a/mm/compaction.c b/mm/compaction.c > index 58a4e3fb72ec..fa9993c8a389 100644 > --- a/mm/compaction.c > +++ b/mm/compaction.c > @@ -1832,9 +1832,43 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) > struct compact_control *cc = (struct compact_control *)data; > struct folio *dst; > int order = folio_order(src); > + bool has_isolated_pages = false; > > +again: > if (!cc->freepages[order].nr_pages) { > - isolate_freepages(cc); > + int i; > + > + for (i = order + 1; i < NR_PAGE_ORDERS; i++) { You could probably just start with a loop that finds the start_order (and do the isolate_freepages() attempt if there's none) and then handle the rest outside of the loop. No need to separately handle the case where you have the exact order available? > + if (cc->freepages[i].nr_pages) { > + struct page *freepage = > + list_first_entry(&cc->freepages[i].pages, > + struct page, lru); > + > + int start_order = i; > + unsigned long size = 1 << start_order; > + > + list_del(&freepage->lru); > + cc->freepages[i].nr_pages--; > + > + while (start_order > order) { With exact order available this while loop will just be skipped and that's all the difference to it? 
> + start_order--; > + size >>= 1; > + > + list_add(&freepage[size].lru, > + &cc->freepages[start_order].pages); > + cc->freepages[start_order].nr_pages++; > + set_page_private(&freepage[size], start_order); > + } > + dst = (struct folio *)freepage; > + goto done; > + } > + } > + if (!has_isolated_pages) { > + isolate_freepages(cc); > + has_isolated_pages = true; > + goto again; > + } > + > if (!cc->freepages[order].nr_pages) > return NULL; > } > @@ -1842,6 +1876,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) > dst = list_first_entry(&cc->freepages[order].pages, struct folio, lru); > cc->freepages[order].nr_pages--; > list_del(&dst->lru); > +done: > post_alloc_hook(&dst->page, order, __GFP_MOVABLE); > if (order) > prep_compound_page(&dst->page, order); ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split. 2024-02-09 18:43 ` Vlastimil Babka @ 2024-02-09 19:57 ` Zi Yan 2024-02-09 20:49 ` Vlastimil Babka 0 siblings, 1 reply; 21+ messages in thread From: Zi Yan @ 2024-02-09 19:57 UTC (permalink / raw) To: Vlastimil Babka Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 4535 bytes --] On 9 Feb 2024, at 13:43, Vlastimil Babka wrote: > On 2/2/24 17:15, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> During migration in a memory compaction, free pages are placed in an array >> of page lists based on their order. But the desired free page order (i.e., >> the order of a source page) might not be always present, thus leading to >> migration failures and premature compaction termination. Split a high >> order free pages when source migration page has a lower order to increase >> migration successful rate. >> >> Note: merging free pages when a migration fails and a lower order free >> page is returned via compaction_free() is possible, but there is too much >> work. Since the free pages are not buddy pages, it is hard to identify >> these free pages using existing PFN-based page merging algorithm. 
>> >> Signed-off-by: Zi Yan <ziy@nvidia.com> >> --- >> mm/compaction.c | 37 ++++++++++++++++++++++++++++++++++++- >> 1 file changed, 36 insertions(+), 1 deletion(-) >> >> diff --git a/mm/compaction.c b/mm/compaction.c >> index 58a4e3fb72ec..fa9993c8a389 100644 >> --- a/mm/compaction.c >> +++ b/mm/compaction.c >> @@ -1832,9 +1832,43 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >> struct compact_control *cc = (struct compact_control *)data; >> struct folio *dst; >> int order = folio_order(src); >> + bool has_isolated_pages = false; >> >> +again: >> if (!cc->freepages[order].nr_pages) { >> - isolate_freepages(cc); >> + int i; >> + >> + for (i = order + 1; i < NR_PAGE_ORDERS; i++) { > > You could probably just start with a loop that finds the start_order (and do > the isolate_freepages() attempt if there's none) and then handle the rest > outside of the loop. No need to separately handle the case where you have > the exact order available? Like this? if (list_empty(&cc->freepages[order].pages)) { int start_order; for (start_order = order + 1; start_order < NR_PAGE_ORDERS; start_order++) if (!list_empty(&cc->freepages[start_order].pages)) break; /* no free pages in the list */ if (start_order == NR_PAGE_ORDERS) { if (!has_isolated_pages) { isolate_freepages(cc); has_isolated_pages = true; goto again; } else return NULL; } struct page *freepage = list_first_entry(&cc->freepages[start_order].pages, struct page, lru); unsigned long size = 1 << start_order; list_del(&freepage->lru); while (start_order > order) { start_order--; size >>= 1; list_add(&freepage[size].lru, &cc->freepages[start_order].pages); set_page_private(&freepage[size], start_order); } dst = (struct folio *)freepage; goto done; } > >> + if (cc->freepages[i].nr_pages) { >> + struct page *freepage = >> + list_first_entry(&cc->freepages[i].pages, >> + struct page, lru); >> + >> + int start_order = i; >> + unsigned long size = 1 << start_order; >> + >> + 
list_del(&freepage->lru); >> + cc->freepages[i].nr_pages--; >> + >> + while (start_order > order) { > > With exact order available this while loop will just be skipped and that's > all the difference to it? > >> + start_order--; >> + size >>= 1; >> + >> + list_add(&freepage[size].lru, >> + &cc->freepages[start_order].pages); >> + cc->freepages[start_order].nr_pages++; >> + set_page_private(&freepage[size], start_order); >> + } >> + dst = (struct folio *)freepage; >> + goto done; >> + } >> + } >> + if (!has_isolated_pages) { >> + isolate_freepages(cc); >> + has_isolated_pages = true; >> + goto again; >> + } >> + >> if (!cc->freepages[order].nr_pages) >> return NULL; >> } >> @@ -1842,6 +1876,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >> dst = list_first_entry(&cc->freepages[order].pages, struct folio, lru); >> cc->freepages[order].nr_pages--; >> list_del(&dst->lru); >> +done: >> post_alloc_hook(&dst->page, order, __GFP_MOVABLE); >> if (order) >> prep_compound_page(&dst->page, order); -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split. 2024-02-09 19:57 ` Zi Yan @ 2024-02-09 20:49 ` Vlastimil Babka 0 siblings, 0 replies; 21+ messages in thread From: Vlastimil Babka @ 2024-02-09 20:49 UTC (permalink / raw) To: Zi Yan Cc: linux-mm, linux-kernel, Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) On 2/9/24 20:57, Zi Yan wrote: > On 9 Feb 2024, at 13:43, Vlastimil Babka wrote: > >> On 2/2/24 17:15, Zi Yan wrote: >>> From: Zi Yan <ziy@nvidia.com> >>> >>> During migration in a memory compaction, free pages are placed in an array >>> of page lists based on their order. But the desired free page order (i.e., >>> the order of a source page) might not be always present, thus leading to >>> migration failures and premature compaction termination. Split a high >>> order free pages when source migration page has a lower order to increase >>> migration successful rate. >>> >>> Note: merging free pages when a migration fails and a lower order free >>> page is returned via compaction_free() is possible, but there is too much >>> work. Since the free pages are not buddy pages, it is hard to identify >>> these free pages using existing PFN-based page merging algorithm. 
>>> >>> Signed-off-by: Zi Yan <ziy@nvidia.com> >>> --- >>> mm/compaction.c | 37 ++++++++++++++++++++++++++++++++++++- >>> 1 file changed, 36 insertions(+), 1 deletion(-) >>> >>> diff --git a/mm/compaction.c b/mm/compaction.c >>> index 58a4e3fb72ec..fa9993c8a389 100644 >>> --- a/mm/compaction.c >>> +++ b/mm/compaction.c >>> @@ -1832,9 +1832,43 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >>> struct compact_control *cc = (struct compact_control *)data; >>> struct folio *dst; >>> int order = folio_order(src); >>> + bool has_isolated_pages = false; >>> >>> +again: >>> if (!cc->freepages[order].nr_pages) { >>> - isolate_freepages(cc); >>> + int i; >>> + >>> + for (i = order + 1; i < NR_PAGE_ORDERS; i++) { >> >> You could probably just start with a loop that finds the start_order (and do >> the isolate_freepages() attempt if there's none) and then handle the rest >> outside of the loop. No need to separately handle the case where you have >> the exact order available? > Like this? Almost/ > if (list_empty(&cc->freepages[order].pages)) { You don't need to do that under that if (). > int start_order; > > for (start_order = order + 1; start_order < NR_PAGE_ORDERS; Just do start_order = order; ... (not order + 1). The rest should just work. 
> start_order++) > if (!list_empty(&cc->freepages[start_order].pages)) > break; > > /* no free pages in the list */ > if (start_order == NR_PAGE_ORDERS) { > if (!has_isolated_pages) { > isolate_freepages(cc); > has_isolated_pages = true; > goto again; > } else > return NULL; > } > > struct page *freepage = > list_first_entry(&cc->freepages[start_order].pages, > struct page, lru); > > unsigned long size = 1 << start_order; > > list_del(&freepage->lru); > > while (start_order > order) { > start_order--; > size >>= 1; > > list_add(&freepage[size].lru, > &cc->freepages[start_order].pages); > set_page_private(&freepage[size], start_order); > } > dst = (struct folio *)freepage; > goto done; > } > >> >>> + if (cc->freepages[i].nr_pages) { >>> + struct page *freepage = >>> + list_first_entry(&cc->freepages[i].pages, >>> + struct page, lru); >>> + >>> + int start_order = i; >>> + unsigned long size = 1 << start_order; >>> + >>> + list_del(&freepage->lru); >>> + cc->freepages[i].nr_pages--; >>> + >>> + while (start_order > order) { >> >> With exact order available this while loop will just be skipped and that's >> all the difference to it? 
>> >>> + start_order--; >>> + size >>= 1; >>> + >>> + list_add(&freepage[size].lru, >>> + &cc->freepages[start_order].pages); >>> + cc->freepages[start_order].nr_pages++; >>> + set_page_private(&freepage[size], start_order); >>> + } >>> + dst = (struct folio *)freepage; >>> + goto done; >>> + } >>> + } >>> + if (!has_isolated_pages) { >>> + isolate_freepages(cc); >>> + has_isolated_pages = true; >>> + goto again; >>> + } >>> + >>> if (!cc->freepages[order].nr_pages) >>> return NULL; >>> } >>> @@ -1842,6 +1876,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data) >>> dst = list_first_entry(&cc->freepages[order].pages, struct folio, lru); >>> cc->freepages[order].nr_pages--; >>> list_del(&dst->lru); >>> +done: >>> post_alloc_hook(&dst->page, order, __GFP_MOVABLE); >>> if (order) >>> prep_compound_page(&dst->page, order); > > > -- > Best Regards, > Yan, Zi ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 0/3] Enable >0 order folio memory compaction 2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan ` (2 preceding siblings ...) 2024-02-02 16:15 ` [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan @ 2024-02-02 19:55 ` Luis Chamberlain 2024-02-02 20:12 ` Zi Yan 2024-02-05 8:16 ` Baolin Wang 4 siblings, 1 reply; 21+ messages in thread From: Luis Chamberlain @ 2024-02-02 19:55 UTC (permalink / raw) To: Zi Yan Cc: linux-mm, linux-kernel, Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Adam Manzanares, Vishal Moola (Oracle) On Fri, Feb 02, 2024 at 11:15:51AM -0500, Zi Yan wrote: > From: Zi Yan <ziy@nvidia.com> > > Hi all, > > This patchset enables >0 order folio memory compaction, which is one of > the prerequisitions for large folio support[1]. > > [1] https://lore.kernel.org/linux-mm/20230912162815.440749-1-zi.yan@sent.com/ This URL started being referenced to your patch series instead of the rationale as to why this is important, and that is that compaction today skips pages with order > 0 and that this is already a problem for the page cache. The correct URL which you had in your *first* cover letter is: https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/ Luis ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 0/3] Enable >0 order folio memory compaction 2024-02-02 19:55 ` [PATCH v3 0/3] Enable >0 order folio memory compaction Luis Chamberlain @ 2024-02-02 20:12 ` Zi Yan 0 siblings, 0 replies; 21+ messages in thread From: Zi Yan @ 2024-02-02 20:12 UTC (permalink / raw) To: Luis Chamberlain Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, Vlastimil Babka, "Kirill A . Shutemov", Johannes Weiner, Baolin Wang, Kemeng Shi, Mel Gorman, Rohan Puri, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 822 bytes --] On 2 Feb 2024, at 14:55, Luis Chamberlain wrote: > On Fri, Feb 02, 2024 at 11:15:51AM -0500, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> Hi all, >> >> This patchset enables >0 order folio memory compaction, which is one of >> the prerequisitions for large folio support[1]. >> >> [1] https://lore.kernel.org/linux-mm/20230912162815.440749-1-zi.yan@sent.com/ > > This URL started being referenced to your patch series instead of the > rationale as to why this is important, and that is that compaction today > skips pages with order > 0 and that this is already a problem for the > page cache. The correct URL which you had in your *first* cover letter > is: > > https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/ You are right. Thank you for correcting it. -- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 0/3] Enable >0 order folio memory compaction 2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan ` (3 preceding siblings ...) 2024-02-02 19:55 ` [PATCH v3 0/3] Enable >0 order folio memory compaction Luis Chamberlain @ 2024-02-05 8:16 ` Baolin Wang 2024-02-05 14:18 ` Zi Yan 4 siblings, 1 reply; 21+ messages in thread From: Baolin Wang @ 2024-02-05 8:16 UTC (permalink / raw) To: Zi Yan, linux-mm, linux-kernel Cc: Huang, Ying, Ryan Roberts, Andrew Morton, Matthew Wilcox (Oracle), David Hildenbrand, Yin, Fengwei, Yu Zhao, Vlastimil Babka, Kirill A . Shutemov, Johannes Weiner, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, Vishal Moola (Oracle) On 2/3/2024 12:15 AM, Zi Yan wrote: > From: Zi Yan <ziy@nvidia.com> > > Hi all, > > This patchset enables >0 order folio memory compaction, which is one of > the prerequisitions for large folio support[1]. It includes the fix[4] for > V2 and is on top of mm-everything-2024-01-29-07-19. > > I am aware of that split free pages is necessary for folio > migration in compaction, since if >0 order free pages are never split > and no order-0 free page is scanned, compaction will end prematurely due > to migration returns -ENOMEM. Free page split becomes a must instead of > an optimization. > > lkp ncompare results for default LRU (-no-mglru) and CONFIG_LRU_GEN are > shown at the bottom (on a 8-CPU (Intel Xeon E5-2650 v4 @ 2.20GHz) 16G VM). > In sum, most of vm-scalability applications do not see performance change, > and the others see ~4% to ~26% performance boost under default LRU and > ~2% to ~6% performance boost under CONFIG_LRU_GEN. For the whole series, looks good to me. And I did not find any regression after running thpcompact. So feel free to add: Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v3 0/3] Enable >0 order folio memory compaction 2024-02-05 8:16 ` Baolin Wang @ 2024-02-05 14:18 ` Zi Yan 0 siblings, 0 replies; 21+ messages in thread From: Zi Yan @ 2024-02-05 14:18 UTC (permalink / raw) To: Baolin Wang Cc: linux-mm, linux-kernel, "Huang, Ying", Ryan Roberts, Andrew Morton, "Matthew Wilcox (Oracle)", David Hildenbrand, "Yin, Fengwei", Yu Zhao, Vlastimil Babka, "Kirill A . Shutemov", Johannes Weiner, Kemeng Shi, Mel Gorman, Rohan Puri, Mcgrof Chamberlain, Adam Manzanares, "Vishal Moola (Oracle)" [-- Attachment #1: Type: text/plain, Size: 1349 bytes --] On 5 Feb 2024, at 3:16, Baolin Wang wrote: > On 2/3/2024 12:15 AM, Zi Yan wrote: >> From: Zi Yan <ziy@nvidia.com> >> >> Hi all, >> >> This patchset enables >0 order folio memory compaction, which is one of >> the prerequisitions for large folio support[1]. It includes the fix[4] for >> V2 and is on top of mm-everything-2024-01-29-07-19. >> >> I am aware of that split free pages is necessary for folio >> migration in compaction, since if >0 order free pages are never split >> and no order-0 free page is scanned, compaction will end prematurely due >> to migration returns -ENOMEM. Free page split becomes a must instead of >> an optimization. >> >> lkp ncompare results for default LRU (-no-mglru) and CONFIG_LRU_GEN are >> shown at the bottom (on a 8-CPU (Intel Xeon E5-2650 v4 @ 2.20GHz) 16G VM). >> In sum, most of vm-scalability applications do not see performance change, >> and the others see ~4% to ~26% performance boost under default LRU and >> ~2% to ~6% performance boost under CONFIG_LRU_GEN. > > For the whole series, looks good to me. And I did not find any regression after running thpcompact. So feel free to add: > Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> > Tested-by: Baolin Wang <baolin.wang@linux.alibaba.com> Thank you for the review and testing. 
-- Best Regards, Yan, Zi [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 854 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads: [~2024-02-09 21:59 UTC | newest]

Thread overview: 21+ messages
2024-02-02 16:15 [PATCH v3 0/3] Enable >0 order folio memory compaction Zi Yan
2024-02-02 16:15 ` [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios Zi Yan
2024-02-09 14:32   ` Vlastimil Babka
2024-02-09 19:25     ` Zi Yan
2024-02-09 20:43       ` Vlastimil Babka
2024-02-09 20:44         ` Zi Yan
2024-02-02 16:15 ` [PATCH v3 2/3] mm/compaction: add support for >0 order folio memory compaction Zi Yan
2024-02-09 16:37   ` Vlastimil Babka
2024-02-09 19:36     ` Zi Yan
2024-02-09 19:40       ` Zi Yan
2024-02-09 20:46         ` Vlastimil Babka
2024-02-09 20:47           ` Zi Yan
2024-02-09 21:58     ` Zi Yan
2024-02-02 16:15 ` [PATCH v3 3/3] mm/compaction: optimize >0 order folio compaction with free page split Zi Yan
2024-02-09 18:43   ` Vlastimil Babka
2024-02-09 19:57     ` Zi Yan
2024-02-09 20:49       ` Vlastimil Babka
2024-02-02 19:55 ` [PATCH v3 0/3] Enable >0 order folio memory compaction Luis Chamberlain
2024-02-02 20:12   ` Zi Yan
2024-02-05  8:16 ` Baolin Wang
2024-02-05 14:18   ` Zi Yan