* [patch -mm 1/2] mm/compaction: split freepages without holding the zone lock fix
@ 2016-06-21 21:47 David Rientjes
  2016-06-21 21:47 ` [patch -mm 2/2] mm, compaction: abort free scanner if split fails David Rientjes
  2016-06-22  1:22 ` [patch] " David Rientjes
  0 siblings, 2 replies; 13+ messages in thread
From: David Rientjes @ 2016-06-21 21:47 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Minchan Kim, Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, linux-kernel, linux-mm

If __isolate_free_page() fails, avoid adding to freelist so we don't
call map_pages() with it.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 Fix for mm-compaction-split-freepages-without-holding-the-zone-lock.patch
 in -mm.

 mm/compaction.c | 29 +++++++++++++----------------
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -494,24 +494,21 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 
 		/* Found a free page, will break it into order-0 pages */
 		order = page_order(page);
-		isolated = __isolate_free_page(page, page_order(page));
+		isolated = __isolate_free_page(page, order);
+		if (!isolated)
+			goto isolate_fail;
 		set_page_private(page, order);
 
 		total_isolated += isolated;
 		list_add_tail(&page->lru, freelist);
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			cc->nr_freepages += isolated;
-			if (!strict &&
-				cc->nr_migratepages <= cc->nr_freepages) {
-				blockpfn += isolated;
-				break;
-			}
-
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-			continue;
+		cc->nr_freepages += isolated;
+		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
+			blockpfn += isolated;
+			break;
 		}
+		/* Advance to end of split page */
+		blockpfn += isolated - 1;
+		cursor += isolated - 1;
+		continue;
 
 isolate_fail:
 		if (strict)
@@ -622,7 +619,7 @@ isolate_freepages_range(struct compact_control *cc,
 		 */
 	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
	map_pages(&freelist);
 
	if (pfn < end_pfn) {
@@ -1124,7 +1121,7 @@ static void isolate_freepages(struct compact_control *cc)
		}
	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
	map_pages(freelist);
 
	/*

^ permalink raw reply	[flat|nested] 13+ messages in thread
* [patch -mm 2/2] mm, compaction: abort free scanner if split fails
  2016-06-21 21:47 [patch -mm 1/2] mm/compaction: split freepages without holding the zone lock fix David Rientjes
@ 2016-06-21 21:47 ` David Rientjes
  2016-06-22  1:22 ` [patch] " David Rientjes
  1 sibling, 0 replies; 13+ messages in thread
From: David Rientjes @ 2016-06-21 21:47 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Minchan Kim, Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, linux-kernel, linux-mm

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly.  If the
per-zone watermark is insufficient for a free page of order <= cc->order,
then terminate the scanner since future splits will also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 Note: I think we may want to backport this to -stable since this problem
 has existed since at least 3.11.  This patch won't cleanly apply to any
 stable tree, though.  If people think it should be backported, let me
 know and I'll handle the failures as they arise and rebase.

 mm/compaction.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -496,7 +496,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 		order = page_order(page);
 		isolated = __isolate_free_page(page, order);
 		if (!isolated)
-			goto isolate_fail;
+			break;
 		set_page_private(page, order);
 
 		total_isolated += isolated;
 		list_add_tail(&page->lru, freelist);
@@ -518,6 +518,9 @@ isolate_fail:
 
 	}
 
+	if (locked)
+		spin_unlock_irqrestore(&cc->zone->lock, flags);
+
 	/*
 	 * There is a tiny chance that we have read bogus compound_order(),
 	 * so be careful to not go outside of the pageblock.
@@ -539,9 +542,6 @@ isolate_fail:
 	if (strict && blockpfn < end_pfn)
 		total_isolated = 0;
 
-	if (locked)
-		spin_unlock_irqrestore(&cc->zone->lock, flags);
-
 	/* Update the pageblock-skip if the whole pageblock was scanned */
 	if (blockpfn == end_pfn)
 		update_pageblock_skip(cc, valid_page, total_isolated, false);
@@ -1068,6 +1068,7 @@ static void isolate_freepages(struct compact_control *cc)
 				block_end_pfn = block_start_pfn,
 				block_start_pfn -= pageblock_nr_pages,
 				isolate_start_pfn = block_start_pfn) {
+		unsigned long isolated;
 
 		/*
 		 * This can iterate a massively long zone without finding any
@@ -1092,8 +1093,12 @@ static void isolate_freepages(struct compact_control *cc)
 			continue;
 
 		/* Found a block suitable for isolating free pages from. */
-		isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, freelist, false);
+		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
+						block_end_pfn, freelist, false);
+		/* If isolation failed, do not continue needlessly */
+		if (!isolated && isolate_start_pfn < block_end_pfn &&
+		    cc->nr_freepages <= cc->nr_migratepages)
+			break;
 
 		/*
 		 * If we isolated enough freepages, or aborted due to async

^ permalink raw reply	[flat|nested] 13+ messages in thread
* [patch] mm, compaction: abort free scanner if split fails
  2016-06-21 21:47 [patch -mm 1/2] mm/compaction: split freepages without holding the zone lock fix David Rientjes
  2016-06-21 21:47 ` [patch -mm 2/2] mm, compaction: abort free scanner if split fails David Rientjes
@ 2016-06-22  1:22 ` David Rientjes
  2016-06-22 11:02   ` Vlastimil Babka
  2016-06-22 21:56   ` Andrew Morton
  1 sibling, 2 replies; 13+ messages in thread
From: David Rientjes @ 2016-06-22 1:22 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka, Minchan Kim, Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, linux-kernel, linux-mm, stable

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly.  If the
watermark is insufficient for a free page of order <= cc->order, then
terminate the scanner since all future splits will also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail while holding zone->lock.

Cc: stable@vger.kernel.org
Signed-off-by: David Rientjes <rientjes@google.com>
---
 Based on Linus's tree

 Suggest including in 4.7 if anybody else agrees?

 mm/compaction.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -441,25 +441,23 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 
 		/* Found a free page, break it into order-0 pages */
 		isolated = split_free_page(page);
+		if (!isolated)
+			break;
+
 		total_isolated += isolated;
+		cc->nr_freepages += isolated;
 		for (i = 0; i < isolated; i++) {
 			list_add(&page->lru, freelist);
 			page++;
 		}
-
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			cc->nr_freepages += isolated;
-			if (!strict &&
-				cc->nr_migratepages <= cc->nr_freepages) {
-				blockpfn += isolated;
-				break;
-			}
-
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-			continue;
+		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
+			blockpfn += isolated;
+			break;
 		}
+		/* Advance to the end of split page */
+		blockpfn += isolated - 1;
+		cursor += isolated - 1;
+		continue;
 
 isolate_fail:
 		if (strict)
@@ -469,6 +467,9 @@ isolate_fail:
 
 	}
 
+	if (locked)
+		spin_unlock_irqrestore(&cc->zone->lock, flags);
+
 	/*
 	 * There is a tiny chance that we have read bogus compound_order(),
 	 * so be careful to not go outside of the pageblock.
@@ -490,9 +491,6 @@ isolate_fail:
 	if (strict && blockpfn < end_pfn)
 		total_isolated = 0;
 
-	if (locked)
-		spin_unlock_irqrestore(&cc->zone->lock, flags);
-
 	/* Update the pageblock-skip if the whole pageblock was scanned */
 	if (blockpfn == end_pfn)
 		update_pageblock_skip(cc, valid_page, total_isolated, false);
@@ -1011,6 +1009,7 @@ static void isolate_freepages(struct compact_control *cc)
 				block_end_pfn = block_start_pfn,
 				block_start_pfn -= pageblock_nr_pages,
 				isolate_start_pfn = block_start_pfn) {
+		unsigned long isolated;
 
 		/*
 		 * This can iterate a massively long zone without finding any
@@ -1035,8 +1034,12 @@ static void isolate_freepages(struct compact_control *cc)
 			continue;
 
 		/* Found a block suitable for isolating free pages from. */
-		isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, freelist, false);
+		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
+						block_end_pfn, freelist, false);
+		/* If isolation failed early, do not continue needlessly */
+		if (!isolated && isolate_start_pfn < block_end_pfn &&
+		    cc->nr_migratepages > cc->nr_freepages)
+			break;
 
 		/*
 		 * If we isolated enough freepages, or aborted due to async

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22  1:22 ` [patch] " David Rientjes
@ 2016-06-22 11:02   ` Vlastimil Babka
  2016-06-22 21:56   ` Andrew Morton
  1 sibling, 0 replies; 13+ messages in thread
From: Vlastimil Babka @ 2016-06-22 11:02 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton, Minchan Kim, Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, linux-kernel, linux-mm, stable

On 06/22/2016 03:22 AM, David Rientjes wrote:
> If the memory compaction free scanner cannot successfully split a free
> page (only possible due to per-zone low watermark), terminate the free
> scanner rather than continuing to scan memory needlessly.  If the
> watermark is insufficient for a free page of order <= cc->order, then
> terminate the scanner since all future splits will also likely fail.
>
> This prevents the compaction freeing scanner from scanning all memory on
> very large zones (very noticeable for zones > 128GB, for instance) when
> all splits will likely fail while holding zone->lock.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
> Based on Linus's tree
>
> Suggest including in 4.7 if anybody else agrees?

4.7 definitely. Stable is less clear especially if you say it won't apply
cleanly, but if you're ready to handle it, sure. The rules now allow
fixing glaring performance bugs.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22  1:22 ` [patch] " David Rientjes
  2016-06-22 11:02   ` Vlastimil Babka
@ 2016-06-22 21:56   ` Andrew Morton
  2016-06-22 21:59     ` Andrew Morton
  2016-06-22 22:06     ` David Rientjes
  1 sibling, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2016-06-22 21:56 UTC (permalink / raw)
  To: David Rientjes
  Cc: Vlastimil Babka, Minchan Kim, Joonsoo Kim, Mel Gorman,
      Hugh Dickins, linux-kernel, linux-mm, stable

On Tue, 21 Jun 2016 18:22:49 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:

> If the memory compaction free scanner cannot successfully split a free
> page (only possible due to per-zone low watermark), terminate the free
> scanner rather than continuing to scan memory needlessly.  If the
> watermark is insufficient for a free page of order <= cc->order, then
> terminate the scanner since all future splits will also likely fail.
>
> This prevents the compaction freeing scanner from scanning all memory on
> very large zones (very noticeable for zones > 128GB, for instance) when
> all splits will likely fail while holding zone->lock.

This collides pretty heavily with Joonsoo's "mm/compaction: split
freepages without holding the zone lock".

I ended up with this, in isolate_freepages_block():

		/* Found a free page, will break it into order-0 pages */
		order = page_order(page);
		isolated = __isolate_free_page(page, page_order(page));
		set_page_private(page, order);

		total_isolated += isolated;
		cc->nr_freepages += isolated;
		list_add_tail(&page->lru, freelist);

		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
			blockpfn += isolated;
			break;
		}
		/* Advance to the end of split page */
		blockpfn += isolated - 1;
		cursor += isolated - 1;
		continue;

isolate_fail:

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22 21:56   ` Andrew Morton
@ 2016-06-22 21:59     ` Andrew Morton
  2016-06-22 23:40       ` David Rientjes
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2016-06-22 21:59 UTC (permalink / raw)
  To: David Rientjes, Vlastimil Babka, Minchan Kim, Joonsoo Kim,
      Mel Gorman, Hugh Dickins, linux-kernel, linux-mm, stable

On Wed, 22 Jun 2016 14:56:17 -0700 Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 21 Jun 2016 18:22:49 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
>
> > If the memory compaction free scanner cannot successfully split a free
> > page (only possible due to per-zone low watermark), terminate the free
> > scanner rather than continuing to scan memory needlessly.  If the
> > watermark is insufficient for a free page of order <= cc->order, then
> > terminate the scanner since all future splits will also likely fail.
> >
> > This prevents the compaction freeing scanner from scanning all memory on
> > very large zones (very noticeable for zones > 128GB, for instance) when
> > all splits will likely fail while holding zone->lock.
>
> This collides pretty heavily with Joonsoo's "mm/compaction: split
> freepages without holding the zone lock".
>
> I ended up with this, in isolate_freepages_block():
>
> 		/* Found a free page, will break it into order-0 pages */
> 		order = page_order(page);
> 		isolated = __isolate_free_page(page, page_order(page));
> 		set_page_private(page, order);
>
> 		total_isolated += isolated;
> 		cc->nr_freepages += isolated;
> 		list_add_tail(&page->lru, freelist);
>
> 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> 			blockpfn += isolated;
> 			break;
> 		}
> 		/* Advance to the end of split page */
> 		blockpfn += isolated - 1;
> 		cursor += isolated - 1;
> 		continue;
>
> isolate_fail:

And mm-compaction-split-freepages-without-holding-the-zone-lock-fix.patch
churns things around some more.  Now this:

		/* Found a free page, will break it into order-0 pages */
		order = page_order(page);
		isolated = __isolate_free_page(page, order);
		set_page_private(page, order);

		total_isolated += isolated;
		list_add_tail(&page->lru, freelist);
		cc->nr_freepages += isolated;
		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
			blockpfn += isolated;
			break;
		}
		/* Advance to the end of split page */
		blockpfn += isolated - 1;
		cursor += isolated - 1;
		continue;

isolate_fail:

and things are looking a bit better...

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22 21:59     ` Andrew Morton
@ 2016-06-22 23:40       ` David Rientjes
  2016-06-23 11:21         ` Vlastimil Babka
  0 siblings, 1 reply; 13+ messages in thread
From: David Rientjes @ 2016-06-22 23:40 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Minchan Kim, Joonsoo Kim, Mel Gorman,
      Hugh Dickins, linux-kernel, linux-mm, stable

On Wed, 22 Jun 2016, Andrew Morton wrote:

> And mm-compaction-split-freepages-without-holding-the-zone-lock-fix.patch
> churns things around some more.  Now this:
>
> 		/* Found a free page, will break it into order-0 pages */
> 		order = page_order(page);
> 		isolated = __isolate_free_page(page, order);
> 		set_page_private(page, order);
> 		total_isolated += isolated;
> 		list_add_tail(&page->lru, freelist);
> 		cc->nr_freepages += isolated;
> 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> 			blockpfn += isolated;
> 			break;
> 		}
> 		/* Advance to the end of split page */
> 		blockpfn += isolated - 1;
> 		cursor += isolated - 1;
> 		continue;
>
> isolate_fail:
>
> and things are looking a bit better...

This looks like it's missing the

	if (!isolated)
		break;

check from mm-compaction-abort-free-scanner-if-split-fails.patch which is
needed to properly terminate when the low watermark fails (and adding to
freelist as Minchan mentioned before I saw this patch).

I rebased mm-compaction-split-freepages-without-holding-the-zone-lock.patch
as I thought it should be done and folded
mm-compaction-split-freepages-without-holding-the-zone-lock-fix.patch into
it for simplicity.  I think this should replace
mm-compaction-split-freepages-without-holding-the-zone-lock.patch in -mm.

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

We don't need to split freepages with holding the zone lock.  It will
cause more contention on zone lock so not desirable.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
---
diff --git a/include/linux/mm.h b/include/linux/mm.h
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -537,7 +537,6 @@ void __put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function. Provide a
diff --git a/mm/compaction.c b/mm/compaction.c
index ab21497..9d17b21 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -65,13 +65,31 @@ static unsigned long release_freepages(struct list_head *freelist)
 
 static void map_pages(struct list_head *list)
 {
-	struct page *page;
+	unsigned int i, order, nr_pages;
+	struct page *page, *next;
+	LIST_HEAD(tmp_list);
+
+	list_for_each_entry_safe(page, next, list, lru) {
+		list_del(&page->lru);
 
-	list_for_each_entry(page, list, lru) {
-		arch_alloc_page(page, 0);
-		kernel_map_pages(page, 1, 1);
-		kasan_alloc_pages(page, 0);
+		order = page_private(page);
+		nr_pages = 1 << order;
+		set_page_private(page, 0);
+		set_page_refcounted(page);
+
+		arch_alloc_page(page, order);
+		kernel_map_pages(page, nr_pages, 1);
+		kasan_alloc_pages(page, order);
+		if (order)
+			split_page(page, order);
+
+		for (i = 0; i < nr_pages; i++) {
+			list_add(&page->lru, &tmp_list);
+			page++;
+		}
 	}
+
+	list_splice(&tmp_list, list);
 }
 
 static inline bool migrate_async_suitable(int migratetype)
@@ -406,12 +424,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 	unsigned long flags = 0;
 	bool locked = false;
 	unsigned long blockpfn = *start_pfn;
+	unsigned int order;
 
 	cursor = pfn_to_page(blockpfn);
 
 	/* Isolate free pages. */
 	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
+		int isolated;
 		struct page *page = cursor;
 
 		/*
@@ -477,17 +496,17 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 			goto isolate_fail;
 		}
 
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
+		/* Found a free page, will break it into order-0 pages */
+		order = page_order(page);
+		isolated = __isolate_free_page(page, order);
+		if (!isolated)
+			break;
+		set_page_private(page, order);
 
 		total_isolated += isolated;
 		cc->nr_freepages += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
+		list_add_tail(&page->lru, freelist);
+
 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
 			blockpfn += isolated;
 			break;
@@ -606,7 +625,7 @@ isolate_freepages_range(struct compact_control *cc,
 		 */
 	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
 	map_pages(&freelist);
 
 	if (pfn < end_pfn) {
@@ -1113,7 +1132,7 @@ static void isolate_freepages(struct compact_control *cc)
 		}
 	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
 	map_pages(freelist);
 
 	/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2560,33 +2560,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
 }
 
 /*
- * Similar to split_page except the page is already free. As this is only
- * being used for migration, the migratetype of the block also changes.
- * As this is called with interrupts disabled, the caller is responsible
- * for calling arch_alloc_page() and kernel_map_page() after interrupts
- * are enabled.
- *
- * Note: this is probably too low level an operation for use in drivers.
- * Please consult with lkml before using this in your driver.
- */
-int split_free_page(struct page *page)
-{
-	unsigned int order;
-	int nr_pages;
-
-	order = page_order(page);
-
-	nr_pages = __isolate_free_page(page, order);
-	if (!nr_pages)
-		return 0;
-
-	/* Split into individual pages */
-	set_page_refcounted(page);
-	split_page(page, order);
-	return nr_pages;
-}
-
-/*
  * Update NUMA hit/miss statistics
  *
  * Must be called with interrupts disabled.

^ permalink raw reply related	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22 23:40       ` David Rientjes
@ 2016-06-23 11:21         ` Vlastimil Babka
  0 siblings, 0 replies; 13+ messages in thread
From: Vlastimil Babka @ 2016-06-23 11:21 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Minchan Kim, Joonsoo Kim, Mel Gorman, Hugh Dickins,
      linux-kernel, linux-mm, stable

On 06/23/2016 01:40 AM, David Rientjes wrote:
> On Wed, 22 Jun 2016, Andrew Morton wrote:
>
>> And mm-compaction-split-freepages-without-holding-the-zone-lock-fix.patch
>> churns things around some more.  Now this:
>>
>> 		/* Found a free page, will break it into order-0 pages */
>> 		order = page_order(page);
>> 		isolated = __isolate_free_page(page, order);
>> 		set_page_private(page, order);
>> 		total_isolated += isolated;
>> 		list_add_tail(&page->lru, freelist);
>> 		cc->nr_freepages += isolated;
>> 		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
>> 			blockpfn += isolated;
>> 			break;
>> 		}
>> 		/* Advance to the end of split page */
>> 		blockpfn += isolated - 1;
>> 		cursor += isolated - 1;
>> 		continue;
>>
>> isolate_fail:
>>
>> and things are looking a bit better...
>
> This looks like it's missing the
>
> 	if (!isolated)
> 		break;
>
> check from mm-compaction-abort-free-scanner-if-split-fails.patch which is
> needed to properly terminate when the low watermark fails (and adding to
> freelist as Minchan mentioned before I saw this patch).

Agreed.

> I rebased mm-compaction-split-freepages-without-holding-the-zone-lock.patch
> as I thought it should be done and folded
> mm-compaction-split-freepages-without-holding-the-zone-lock-fix.patch into
> it for simplicity.  I think this should replace
> mm-compaction-split-freepages-without-holding-the-zone-lock.patch in -mm.

Yes, it should replace both the .patch and the -fix.patch. Thanks!

> [remainder of the rebased patch quoted in full without further comment]

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22 21:56   ` Andrew Morton
  2016-06-22 21:59     ` Andrew Morton
@ 2016-06-22 22:06     ` David Rientjes
  2016-06-22 22:42       ` Andrew Morton
  1 sibling, 1 reply; 13+ messages in thread
From: David Rientjes @ 2016-06-22 22:06 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Vlastimil Babka, Minchan Kim, Joonsoo Kim, Mel Gorman,
      Hugh Dickins, linux-kernel, linux-mm, stable

On Wed, 22 Jun 2016, Andrew Morton wrote:

> On Tue, 21 Jun 2016 18:22:49 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
>
> > If the memory compaction free scanner cannot successfully split a free
> > page (only possible due to per-zone low watermark), terminate the free
> > scanner rather than continuing to scan memory needlessly.  If the
> > watermark is insufficient for a free page of order <= cc->order, then
> > terminate the scanner since all future splits will also likely fail.
> >
> > This prevents the compaction freeing scanner from scanning all memory on
> > very large zones (very noticeable for zones > 128GB, for instance) when
> > all splits will likely fail while holding zone->lock.
>
> This collides pretty heavily with Joonsoo's "mm/compaction: split
> freepages without holding the zone lock".

Sorry if it wasn't clear, but I was proposing this patch for 4.7
inclusion and Vlastimil agreed we should ask for that.  Joonsoo said he
was prepared to rebase on top of that.  Is
mm-compaction-split-freepages-without-holding-the-zone-lock.patch and
friends going into 4.7 or are we deferring this fix until 4.8?

compaction_alloc() iterating a 128GB zone has been benchmarked to take
over 400ms on some systems whereas any free page isolated and ready to
be split ends up failing in split_free_page() because of the low
watermark check and thus the iteration continues.

The next time compaction occurs, the freeing scanner will likely start
at the end of the zone again since no success was made previously and we
get the same lengthy iteration until the zone is brought above the low
watermark.  All thp page faults can take >400ms in such a state without
this fix.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-22 22:06     ` David Rientjes
@ 2016-06-22 22:42       ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2016-06-22 22:42 UTC (permalink / raw)
  To: David Rientjes
  Cc: Vlastimil Babka, Minchan Kim, Joonsoo Kim, Mel Gorman,
      Hugh Dickins, linux-kernel, linux-mm, stable

On Wed, 22 Jun 2016 15:06:29 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:

> On Wed, 22 Jun 2016, Andrew Morton wrote:
>
> > On Tue, 21 Jun 2016 18:22:49 -0700 (PDT) David Rientjes <rientjes@google.com> wrote:
> >
> > > If the memory compaction free scanner cannot successfully split a free
> > > page (only possible due to per-zone low watermark), terminate the free
> > > scanner rather than continuing to scan memory needlessly.  If the
> > > watermark is insufficient for a free page of order <= cc->order, then
> > > terminate the scanner since all future splits will also likely fail.
> > >
> > > This prevents the compaction freeing scanner from scanning all memory on
> > > very large zones (very noticeable for zones > 128GB, for instance) when
> > > all splits will likely fail while holding zone->lock.
> >
> > This collides pretty heavily with Joonsoo's "mm/compaction: split
> > freepages without holding the zone lock".
>
> Sorry if it wasn't clear, but I was proposing this patch for 4.7
> inclusion and Vlastimil agreed we should ask for that.  Joonsoo said he
> was prepared to rebase on top of that.  Is
> mm-compaction-split-freepages-without-holding-the-zone-lock.patch and
> friends going into 4.7 or are we deferring this fix until 4.8?

I have this patch lined up for 4.7 so I was rebasing Joonsoo's patches
on top.

^ permalink raw reply	[flat|nested] 13+ messages in thread
* [patch] mm, compaction: ignore watermarks when isolating free pages
@ 2016-06-15 22:34 David Rientjes
  2016-06-16  7:15 ` Vlastimil Babka
  0 siblings, 1 reply; 13+ messages in thread
From: David Rientjes @ 2016-06-15 22:34 UTC (permalink / raw)
  To: Andrew Morton, Mel Gorman
  Cc: Hugh Dickins, Vlastimil Babka, Joonsoo Kim, linux-kernel, linux-mm

The goal of memory compaction is to defragment memory by moving migratable
pages to free pages at the end of the zone.  No additional memory is being
allocated.

Ignore per-zone low watermarks in __isolate_free_page() because memory is
either fully migrated or isolated free pages are returned when migration
fails.

This fixes an issue where the compaction freeing scanner can isolate
memory but the zone drops below its low watermark for that page order, so
the scanner must continue to scan all memory pointlessly.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/page_alloc.c | 14 ++------------
 1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2484,23 +2484,14 @@ EXPORT_SYMBOL_GPL(split_page);
 
 int __isolate_free_page(struct page *page, unsigned int order)
 {
-	unsigned long watermark;
 	struct zone *zone;
-	int mt;
+	const int mt = get_pageblock_migratetype(page);
 
 	BUG_ON(!PageBuddy(page));
-
 	zone = page_zone(page);
-	mt = get_pageblock_migratetype(page);
-
-	if (!is_migrate_isolate(mt)) {
-		/* Obey watermarks as if the page was being allocated */
-		watermark = low_wmark_pages(zone) + (1 << order);
-		if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
-			return 0;
 
+	if (!is_migrate_isolate(mt))
 		__mod_zone_freepage_state(zone, -(1UL << order), mt);
-	}
 
 	/* Remove page from free list */
 	list_del(&page->lru);
@@ -2520,7 +2511,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
 	}
 }
 
-
 	return 1UL << order;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: ignore watermarks when isolating free pages
  2016-06-15 22:34 [patch] mm, compaction: ignore watermarks when isolating free pages David Rientjes
@ 2016-06-16  7:15 ` Vlastimil Babka
  2016-06-20 22:27   ` [patch] mm, compaction: abort free scanner if split fails David Rientjes
  0 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2016-06-16  7:15 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton, Mel Gorman
  Cc: Hugh Dickins, Joonsoo Kim, linux-kernel, linux-mm, Michal Hocko

On 06/16/2016 12:34 AM, David Rientjes wrote:
> The goal of memory compaction is to defragment memory by moving migratable
> pages to free pages at the end of the zone.  No additional memory is being
> allocated.
>
> Ignore per-zone low watermarks in __isolate_free_page() because memory is
> either fully migrated or isolated free pages are returned when migration
> fails.

Michal Hocko suggested that too, but I didn't think it safe that
compaction should go below the min watermark, even temporarily.  It means
the system is struggling with order-0 allocations, so making it worse for
the benefit of high-order allocations doesn't make sense.  The high-order
allocation would likely fail anyway due to watermark checks, even if the
page of sufficient order was formed by compaction.  So in my series, I
just changed the low watermark check to min [1].

> This fixes an issue where the compaction freeing scanner can isolate
> memory but the zone drops below its low watermark for that page order, so
> the scanner must continue to scan all memory pointlessly.

Good point, looks like failing the watermark is the only reason when
__isolate_free_page() can fail.  isolate_freepages_block() and its
callers should take this as an indication that compaction should return
with failure immediately.

[1] http://article.gmane.org/gmane.linux.kernel/2231369

> Signed-off-by: David Rientjes <rientjes@google.com>
> ---
>  mm/page_alloc.c | 14 ++------------
>  1 file changed, 2 insertions(+), 12 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2484,23 +2484,14 @@ EXPORT_SYMBOL_GPL(split_page);
>
>  int __isolate_free_page(struct page *page, unsigned int order)
>  {
> -	unsigned long watermark;
>  	struct zone *zone;
> -	int mt;
> +	const int mt = get_pageblock_migratetype(page);
>
>  	BUG_ON(!PageBuddy(page));
> -
>  	zone = page_zone(page);
> -	mt = get_pageblock_migratetype(page);
> -
> -	if (!is_migrate_isolate(mt)) {
> -		/* Obey watermarks as if the page was being allocated */
> -		watermark = low_wmark_pages(zone) + (1 << order);
> -		if (!zone_watermark_ok(zone, 0, watermark, 0, 0))
> -			return 0;
>
> +	if (!is_migrate_isolate(mt))
>  		__mod_zone_freepage_state(zone, -(1UL << order), mt);
> -	}
>
>  	/* Remove page from free list */
>  	list_del(&page->lru);
> @@ -2520,7 +2511,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
>  		}
>  	}
>
> -
>  	return 1UL << order;
>  }
>

^ permalink raw reply	[flat|nested] 13+ messages in thread
* [patch] mm, compaction: abort free scanner if split fails
  2016-06-16  7:15 ` Vlastimil Babka
@ 2016-06-20 22:27   ` David Rientjes
  2016-06-21 11:43     ` Vlastimil Babka
  0 siblings, 1 reply; 13+ messages in thread
From: David Rientjes @ 2016-06-20 22:27 UTC (permalink / raw)
  To: Andrew Morton, Vlastimil Babka
  Cc: Mel Gorman, Hugh Dickins, Joonsoo Kim, linux-kernel, linux-mm

If the memory compaction free scanner cannot successfully split a free
page (only possible due to per-zone low watermark), terminate the free
scanner rather than continuing to scan memory needlessly.

If the per-zone watermark is insufficient for a free page of
order <= cc->order, then terminate the scanner since future splits will
also likely fail.

This prevents the compaction freeing scanner from scanning all memory on
very large zones (very noticeable for zones > 128GB, for instance) when
all splits will likely fail.

Signed-off-by: David Rientjes <rientjes@google.com>
---
 mm/compaction.c | 49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -494,24 +494,22 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 
 		/* Found a free page, will break it into order-0 pages */
 		order = page_order(page);
-		isolated = __isolate_free_page(page, page_order(page));
+		isolated = __isolate_free_page(page, order);
+		if (!isolated)
+			break;
 		set_page_private(page, order);
 		total_isolated += isolated;
 		list_add_tail(&page->lru, freelist);
 
-		/* If a page was split, advance to the end of it */
-		if (isolated) {
-			cc->nr_freepages += isolated;
-			if (!strict &&
-			    cc->nr_migratepages <= cc->nr_freepages) {
-				blockpfn += isolated;
-				break;
-			}
-
-			blockpfn += isolated - 1;
-			cursor += isolated - 1;
-			continue;
+		/* Advance to the end of split page */
+		cc->nr_freepages += isolated;
+		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
+			blockpfn += isolated;
+			break;
 		}
+		blockpfn += isolated - 1;
+		cursor += isolated - 1;
+		continue;
 
 isolate_fail:
 		if (strict)
@@ -521,6 +519,9 @@ isolate_fail:
 
 	}
 
+	if (locked)
+		spin_unlock_irqrestore(&cc->zone->lock, flags);
+
 	/*
 	 * There is a tiny chance that we have read bogus compound_order(),
 	 * so be careful to not go outside of the pageblock.
@@ -542,9 +543,6 @@ isolate_fail:
 	if (strict && blockpfn < end_pfn)
 		total_isolated = 0;
 
-	if (locked)
-		spin_unlock_irqrestore(&cc->zone->lock, flags);
-
 	/* Update the pageblock-skip if the whole pageblock was scanned */
 	if (blockpfn == end_pfn)
 		update_pageblock_skip(cc, valid_page, total_isolated, false);
@@ -622,7 +620,7 @@ isolate_freepages_range(struct compact_control *cc,
 	 */
 	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
 	map_pages(&freelist);
 
 	if (pfn < end_pfn) {
@@ -1071,6 +1069,7 @@ static void isolate_freepages(struct compact_control *cc)
 				block_end_pfn = block_start_pfn,
 				block_start_pfn -= pageblock_nr_pages,
 				isolate_start_pfn = block_start_pfn) {
+		unsigned long isolated;
 
 		/*
 		 * This can iterate a massively long zone without finding any
@@ -1095,8 +1094,12 @@ static void isolate_freepages(struct compact_control *cc)
 			continue;
 
 		/* Found a block suitable for isolating free pages from. */
-		isolate_freepages_block(cc, &isolate_start_pfn,
-					block_end_pfn, freelist, false);
+		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
+						block_end_pfn, freelist, false);
+		/* If free page split failed, do not continue needlessly */
+		if (!isolated && isolate_start_pfn < block_end_pfn &&
+		    cc->nr_freepages <= cc->nr_migratepages)
+			break;
 
 		/*
 		 * If we isolated enough freepages, or aborted due to async
@@ -1124,7 +1127,7 @@ static void isolate_freepages(struct compact_control *cc)
 		}
 	}
 
-	/* split_free_page does not map the pages */
+	/* __isolate_free_page() does not map the pages */
 	map_pages(freelist);
 
 	/*
@@ -1703,6 +1706,12 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
 			continue;
 		}
 
+		/* Don't attempt compaction if splitting free page will fail */
+		if (!zone_watermark_ok(zone, 0,
+				low_wmark_pages(zone) + (1 << order),
+				0, 0))
+			continue;
+
 		status = compact_zone_order(zone, order, gfp_mask, mode,
 				&zone_contended, alloc_flags,
 				ac_classzone_idx(ac));

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-20 22:27 ` [patch] mm, compaction: abort free scanner if split fails David Rientjes
@ 2016-06-21 11:43   ` Vlastimil Babka
  2016-06-21 20:43     ` David Rientjes
  0 siblings, 1 reply; 13+ messages in thread
From: Vlastimil Babka @ 2016-06-21 11:43 UTC (permalink / raw)
  To: David Rientjes, Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Joonsoo Kim, linux-kernel, linux-mm,
	Minchan Kim

On 06/21/2016 12:27 AM, David Rientjes wrote:
> If the memory compaction free scanner cannot successfully split a free
> page (only possible due to per-zone low watermark), terminate the free
> scanner rather than continuing to scan memory needlessly.
>
> If the per-zone watermark is insufficient for a free page of
> order <= cc->order, then terminate the scanner since future splits will
> also likely fail.
>
> This prevents the compaction freeing scanner from scanning all memory on
> very large zones (very noticeable for zones > 128GB, for instance) when
> all splits will likely fail.
>
> Signed-off-by: David Rientjes <rientjes@google.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

But some notes below.

> ---
>  mm/compaction.c | 49 +++++++++++++++++++++++++++++--------------------
>  1 file changed, 29 insertions(+), 20 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -494,24 +494,22 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>
>  		/* Found a free page, will break it into order-0 pages */
>  		order = page_order(page);
> -		isolated = __isolate_free_page(page, page_order(page));
> +		isolated = __isolate_free_page(page, order);
> +		if (!isolated)
> +			break;

This seems to fix as a side-effect a bug in Joonsoo's mmotm patch
mm-compaction-split-freepages-without-holding-the-zone-lock.patch, that
Minchan found: http://marc.info/?l=linux-mm&m=146607176528495&w=2

So it should be noted somewhere so they are merged together.  Or Joonsoo
posts an isolated fix and this patch has to rebase.

>  		set_page_private(page, order);
>  		total_isolated += isolated;
>  		list_add_tail(&page->lru, freelist);
>
> -		/* If a page was split, advance to the end of it */
> -		if (isolated) {
> -			cc->nr_freepages += isolated;
> -			if (!strict &&
> -			    cc->nr_migratepages <= cc->nr_freepages) {
> -				blockpfn += isolated;
> -				break;
> -			}
> -
> -			blockpfn += isolated - 1;
> -			cursor += isolated - 1;
> -			continue;
> +		/* Advance to the end of split page */
> +		cc->nr_freepages += isolated;
> +		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> +			blockpfn += isolated;
> +			break;
>  		}
> +		blockpfn += isolated - 1;
> +		cursor += isolated - 1;
> +		continue;
>
>  isolate_fail:
>  		if (strict)
> @@ -521,6 +519,9 @@ isolate_fail:
>
>  	}
>
> +	if (locked)
> +		spin_unlock_irqrestore(&cc->zone->lock, flags);
> +
>  	/*
>  	 * There is a tiny chance that we have read bogus compound_order(),
>  	 * so be careful to not go outside of the pageblock.
> @@ -542,9 +543,6 @@ isolate_fail:
>  	if (strict && blockpfn < end_pfn)
>  		total_isolated = 0;
>
> -	if (locked)
> -		spin_unlock_irqrestore(&cc->zone->lock, flags);
> -
>  	/* Update the pageblock-skip if the whole pageblock was scanned */
>  	if (blockpfn == end_pfn)
>  		update_pageblock_skip(cc, valid_page, total_isolated, false);
> @@ -622,7 +620,7 @@ isolate_freepages_range(struct compact_control *cc,
>  	 */
>  	}
>
> -	/* split_free_page does not map the pages */
> +	/* __isolate_free_page() does not map the pages */
>  	map_pages(&freelist);
>
>  	if (pfn < end_pfn) {
> @@ -1071,6 +1069,7 @@ static void isolate_freepages(struct compact_control *cc)
>  				block_end_pfn = block_start_pfn,
>  				block_start_pfn -= pageblock_nr_pages,
>  				isolate_start_pfn = block_start_pfn) {
> +		unsigned long isolated;
>
>  		/*
>  		 * This can iterate a massively long zone without finding any
> @@ -1095,8 +1094,12 @@ static void isolate_freepages(struct compact_control *cc)
>  			continue;
>
>  		/* Found a block suitable for isolating free pages from. */
> -		isolate_freepages_block(cc, &isolate_start_pfn,
> -					block_end_pfn, freelist, false);
> +		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> +						block_end_pfn, freelist, false);
> +		/* If free page split failed, do not continue needlessly */

More accurately, free page isolation failed?

> +		if (!isolated && isolate_start_pfn < block_end_pfn &&
> +		    cc->nr_freepages <= cc->nr_migratepages)
> +			break;
>
>  		/*
>  		 * If we isolated enough freepages, or aborted due to async
> @@ -1124,7 +1127,7 @@ static void isolate_freepages(struct compact_control *cc)
>  		}
>  	}
>
> -	/* split_free_page does not map the pages */
> +	/* __isolate_free_page() does not map the pages */
>  	map_pages(freelist);
>
>  	/*
> @@ -1703,6 +1706,12 @@ enum compact_result try_to_compact_pages(gfp_t gfp_mask, unsigned int order,
>  			continue;
>  		}
>
> +		/* Don't attempt compaction if splitting free page will fail */
> +		if (!zone_watermark_ok(zone, 0,
> +				low_wmark_pages(zone) + (1 << order),
> +				0, 0))
> +			continue;
> +

Please don't add this, compact_zone already checks this via
compaction_suitable() (and the usual 2 << order gap), so this is adding
yet another watermark check with a different kind of gap.

Thanks.

>  		status = compact_zone_order(zone, order, gfp_mask, mode,
>  				&zone_contended, alloc_flags,
>  				ac_classzone_idx(ac));
>

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: [patch] mm, compaction: abort free scanner if split fails
  2016-06-21 11:43 ` Vlastimil Babka
@ 2016-06-21 20:43   ` David Rientjes
  0 siblings, 0 replies; 13+ messages in thread
From: David Rientjes @ 2016-06-21 20:43 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Joonsoo Kim,
	linux-kernel, linux-mm, Minchan Kim

On Tue, 21 Jun 2016, Vlastimil Babka wrote:

> > diff --git a/mm/compaction.c b/mm/compaction.c
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -494,24 +494,22 @@ static unsigned long isolate_freepages_block(struct
> > compact_control *cc,
> >
> >  		/* Found a free page, will break it into order-0 pages */
> >  		order = page_order(page);
> > -		isolated = __isolate_free_page(page, page_order(page));
> > +		isolated = __isolate_free_page(page, order);
> > +		if (!isolated)
> > +			break;
>
> This seems to fix as a side-effect a bug in Joonsoo's mmotm patch
> mm-compaction-split-freepages-without-holding-the-zone-lock.patch, that
> Minchan found: http://marc.info/?l=linux-mm&m=146607176528495&w=2
>
> So it should be noted somewhere so they are merged together. Or Joonsoo posts
> an isolated fix and this patch has to rebase.
>

Indeed, I hadn't noticed the differences between Linus's tree and -mm.
Thanks very much for pointing it out.

My interest is to eventually backport this to a much older kernel where
we suffer from the same issue: it seems that we have always not
terminated the freeing scanner when splitting the free page fails, and we
feel it because some of our systems have 128GB zones and migrate_pages()
can call compaction_alloc() several times if it keeps getting -EAGAIN.
It's very expensive.

I'm not sure we should label it as a -fix for
mm-compaction-split-freepages-without-holding-the-zone-lock.patch since
the problem this patch is addressing has seemingly existed for years.
Perhaps it would be better to have two patches, one as a -fix and then
the abort on page split failure on top.  I'll send out a two patch
series in this form.

> >  		set_page_private(page, order);
> >  		total_isolated += isolated;
> >  		list_add_tail(&page->lru, freelist);
> >
> > -		/* If a page was split, advance to the end of it */
> > -		if (isolated) {
> > -			cc->nr_freepages += isolated;
> > -			if (!strict &&
> > -			    cc->nr_migratepages <= cc->nr_freepages) {
> > -				blockpfn += isolated;
> > -				break;
> > -			}
> > -
> > -			blockpfn += isolated - 1;
> > -			cursor += isolated - 1;
> > -			continue;
> > +		/* Advance to the end of split page */
> > +		cc->nr_freepages += isolated;
> > +		if (!strict && cc->nr_migratepages <= cc->nr_freepages) {
> > +			blockpfn += isolated;
> > +			break;
> >  		}
> > +		blockpfn += isolated - 1;
> > +		cursor += isolated - 1;
> > +		continue;
> >
> >  isolate_fail:
> >  		if (strict)
> > @@ -521,6 +519,9 @@ isolate_fail:
> >
> >  	}
> >
> > +	if (locked)
> > +		spin_unlock_irqrestore(&cc->zone->lock, flags);
> > +
> >  	/*
> >  	 * There is a tiny chance that we have read bogus compound_order(),
> >  	 * so be careful to not go outside of the pageblock.
> > @@ -542,9 +543,6 @@ isolate_fail:
> >  	if (strict && blockpfn < end_pfn)
> >  		total_isolated = 0;
> >
> > -	if (locked)
> > -		spin_unlock_irqrestore(&cc->zone->lock, flags);
> > -
> >  	/* Update the pageblock-skip if the whole pageblock was scanned */
> >  	if (blockpfn == end_pfn)
> >  		update_pageblock_skip(cc, valid_page, total_isolated, false);
> > @@ -622,7 +620,7 @@ isolate_freepages_range(struct compact_control *cc,
> >  	 */
> >  	}
> >
> > -	/* split_free_page does not map the pages */
> > +	/* __isolate_free_page() does not map the pages */
> >  	map_pages(&freelist);
> >
> >  	if (pfn < end_pfn) {
> > @@ -1071,6 +1069,7 @@ static void isolate_freepages(struct compact_control
> > *cc)
> >  				block_end_pfn = block_start_pfn,
> >  				block_start_pfn -= pageblock_nr_pages,
> >  				isolate_start_pfn = block_start_pfn) {
> > +		unsigned long isolated;
> >
> >  		/*
> >  		 * This can iterate a massively long zone without finding any
> > @@ -1095,8 +1094,12 @@ static void isolate_freepages(struct compact_control
> > *cc)
> >  			continue;
> >
> >  		/* Found a block suitable for isolating free pages from. */
> > -		isolate_freepages_block(cc, &isolate_start_pfn,
> > -					block_end_pfn, freelist, false);
> > +		isolated = isolate_freepages_block(cc, &isolate_start_pfn,
> > +						block_end_pfn, freelist,
> > false);
> > +		/* If free page split failed, do not continue needlessly */
>
> More accurately, free page isolation failed?
>

Eek, maybe.  The condition should only work if we terminated early
because

 - need_resched() or zone->lock contention for MIGRATE_ASYNC, or

 - __isolate_free_page() fails.

And the latter can only fail because of this (somewhat arbitrary) split
watermark check.  I'll rename it because it includes both, but I thought
the next immediate condition check for cc->contended and its comment was
explanatory enough.

> > +		if (!isolated && isolate_start_pfn < block_end_pfn &&
> > +		    cc->nr_freepages <= cc->nr_migratepages)
> > +			break;
> >
> >  		/*
> >  		 * If we isolated enough freepages, or aborted due to async
> > @@ -1124,7 +1127,7 @@ static void isolate_freepages(struct compact_control
> > *cc)
> >  		}
> >  	}
> >
> > -	/* split_free_page does not map the pages */
> > +	/* __isolate_free_page() does not map the pages */
> >  	map_pages(freelist);
> >
> >  	/*
> > @@ -1703,6 +1706,12 @@ enum compact_result try_to_compact_pages(gfp_t
> > gfp_mask, unsigned int order,
> >  			continue;
> >  		}
> >
> > +		/* Don't attempt compaction if splitting free page will fail
> > */
> > +		if (!zone_watermark_ok(zone, 0,
> > +				low_wmark_pages(zone) + (1 << order),
> > +				0, 0))
> > +			continue;
> > +
>
> Please don't add this, compact_zone already checks this via
> compaction_suitable() (and the usual 2 << order gap), so this is adding yet
> another watermark check with a different kind of gap.
>

Good point, thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread
end of thread, other threads: [~2016-06-23 11:22 UTC | newest]

Thread overview: 13+ messages
2016-06-21 21:47 [patch -mm 1/2] mm/compaction: split freepages without holding the zone lock fix David Rientjes
2016-06-21 21:47 ` [patch -mm 2/2] mm, compaction: abort free scanner if split fails David Rientjes
2016-06-22  1:22   ` [patch] " David Rientjes
2016-06-22 11:02     ` Vlastimil Babka
2016-06-22 21:56       ` Andrew Morton
2016-06-22 21:59         ` Andrew Morton
2016-06-22 23:40           ` David Rientjes
2016-06-23 11:21             ` Vlastimil Babka
2016-06-22 22:06       ` David Rientjes
2016-06-22 22:42         ` Andrew Morton

-- strict thread matches above, loose matches on Subject: below --
2016-06-15 22:34 [patch] mm, compaction: ignore watermarks when isolating free pages David Rientjes
2016-06-16  7:15 ` Vlastimil Babka
2016-06-20 22:27   ` [patch] mm, compaction: abort free scanner if split fails David Rientjes
2016-06-21 11:43     ` Vlastimil Babka
2016-06-21 20:43       ` David Rientjes