* kswapd endless loop for compaction
@ 2012-11-20 19:04 Johannes Weiner
2012-11-21 22:01 ` Johannes Weiner
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Johannes Weiner @ 2012-11-20 19:04 UTC (permalink / raw)
To: Rik van Riel, Mel Gorman; +Cc: linux-mm, linux-kernel
Hi guys,
while testing a 3.7-rc5ish kernel, I noticed that kswapd can drop into
a busy spin state without doing reclaim. printk-style debugging told
me that this happens when the distance between a zone's high watermark
and its low watermark is less than two huge pages (DMA zone).
1. The first loop in balance_pgdat() over the zones finds all zones to
be above their high watermark and only does goto out (all_zones_ok).
2. pgdat_balanced() at the out: label also just checks the high
watermark, so the node is considered balanced and the order is not
reduced.
3. In the `if (order)' block after it, compaction_suitable() checks if
the zone's low watermark + twice the huge page size is okay, which
is not necessarily the case in a small zone, and so COMPACT_SKIPPED
makes it go back to loop_again:.
This will go on until somebody else allocates and breaches the high
watermark and then hopefully goes on to reclaim the zone above low
watermark + 2 * THP.
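To make the three conditions above concrete, here is a minimal userspace
sketch of how they can contradict each other. It is a simplified model and
not the kernel code: struct zone_model, the helper names and any page counts
fed into it are assumptions made up for illustration. Only the two
thresholds (high watermark for the balance checks, low watermark + twice the
huge page size for the compaction check) are taken from the description
above, and things like lowmem reserves and per-order free lists are ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few zone fields that matter here (in pages). */
struct zone_model {
	unsigned long free_pages;
	unsigned long wmark_low;
	unsigned long wmark_high;
};

/* Pages the compaction check wants free on top of the low watermark. */
static unsigned long compact_gap(unsigned int order)
{
	/* Guard against shifting past the width of unsigned long. */
	assert(order < 8 * sizeof(unsigned long) - 1);
	return 2UL << order;	/* twice the size of one order-sized block */
}

/* Steps 1 and 2: the balance checks only look at the high watermark. */
static bool zone_balanced(const struct zone_model *z)
{
	return z->free_pages > z->wmark_high;
}

/* Step 3: the compaction check wants low watermark + 2 * (1 << order). */
static bool compaction_possible(const struct zone_model *z, unsigned int order)
{
	return z->free_pages > z->wmark_low + compact_gap(order);
}

/*
 * True when the zone looks balanced (so nothing gets reclaimed) but
 * compaction is still skipped (so the high-order pass is retried).
 */
static bool kswapd_would_spin(const struct zone_model *z, unsigned int order)
{
	return order > 0 && zone_balanced(z) && !compaction_possible(z, order);
}
```

For example, with invented values of 900 free pages, a low watermark of 200
and a high watermark of 400, an order-9 request needs more than 200 + 1024
free pages for compaction, so kswapd_would_spin() is true even though the
zone is well above its high watermark.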
I'm not really sure what the correct solution is. Should we modify
the zone_watermark_ok() checks in balance_pgdat() to take into account
the higher watermark requirements for reclaim on behalf of compaction?
Change the check in compaction_suitable() / not use it directly?
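As a rough illustration of the first option only, the balance check could
use whichever threshold is larger for high-order requests. This is a
userspace sketch under assumptions, not a patch: struct zone_model,
balance_target() and zone_balanced_for_order() are hypothetical names, and
the real zone_watermark_ok() takes more into account than a single
free-page count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few zone fields that matter here (in pages). */
struct zone_model {
	unsigned long free_pages;
	unsigned long wmark_low;
	unsigned long wmark_high;
};

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/*
 * Free-page level a zone has to exceed to count as balanced. For
 * order-0 this is just the high watermark; for higher orders it also
 * covers the low watermark + 2 * (1 << order) that compaction wants.
 */
static unsigned long balance_target(const struct zone_model *z,
				    unsigned int order)
{
	unsigned long target = z->wmark_high;

	/* Guard against shifting past the width of unsigned long. */
	assert(order < 8 * sizeof(unsigned long) - 1);
	if (order > 0)
		target = max_ul(target, z->wmark_low + (2UL << order));
	return target;
}

/* Balance check that agrees with the compaction check by construction. */
static bool zone_balanced_for_order(const struct zone_model *z,
				    unsigned int order)
{
	return z->free_pages > balance_target(z, order);
}
```

With both checks derived from the same target, a zone can no longer be
"balanced" and "not suitable for compaction" at the same time, at the cost
of reclaiming more aggressively from small zones on behalf of high-order
requests.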
Thanks,
Johannes
* Re: kswapd endless loop for compaction
From: Johannes Weiner @ 2012-11-21 22:01 UTC (permalink / raw)
To: Rik van Riel, Mel Gorman; +Cc: linux-mm, linux-kernel
Just to be clear, this is not fixed by Dave's patch to NR_FREE_PAGES
accounting.
I can still get 3.7-rc5 + Dave's fix to drop into an endless loop in
kswapd within a couple of minutes on my test box.
As described below, the bug comes from contradicting conditions in
balance_pgdat(), not an accounting problem.
On Tue, Nov 20, 2012 at 02:04:41PM -0500, Johannes Weiner wrote:
> Hi guys,
>
> while testing a 3.7-rc5ish kernel, I noticed that kswapd can drop into
> a busy spin state without doing reclaim. printk-style debugging told
> me that this happens when the distance between a zone's high watermark
> and its low watermark is less than two huge pages (DMA zone).
>
> 1. The first loop in balance_pgdat() over the zones finds all zones to
> be above their high watermark and only does goto out (all_zones_ok).
>
> 2. pgdat_balanced() at the out: label also just checks the high
> watermark, so the node is considered balanced and the order is not
> reduced.
>
> 3. In the `if (order)' block after it, compaction_suitable() checks if
> the zone's low watermark + twice the huge page size is okay, which
> is not necessarily the case in a small zone, and so COMPACT_SKIPPED
> makes it go back to loop_again:.
>
> This will go on until somebody else allocates and breaches the high
> watermark and then hopefully goes on to reclaim the zone above low
> watermark + 2 * THP.
>
> I'm not really sure what the correct solution is. Should we modify
> the zone_watermark_ok() checks in balance_pgdat() to take into account
> the higher watermark requirements for reclaim on behalf of compaction?
> Change the check in compaction_suitable() / not use it directly?
>
> Thanks,
> Johannes
* Re: kswapd endless loop for compaction
From: Jaegeuk Hanse @ 2012-11-22 14:40 UTC (permalink / raw)
To: Johannes Weiner; +Cc: Rik van Riel, Mel Gorman, linux-mm, linux-kernel
On 11/21/2012 03:04 AM, Johannes Weiner wrote:
> Hi guys,
>
> while testing a 3.7-rc5ish kernel, I noticed that kswapd can drop into
> a busy spin state without doing reclaim. printk-style debugging told
> me that this happens when the distance between a zone's high watermark
> and its low watermark is less than two huge pages (DMA zone).
>
> 1. The first loop in balance_pgdat() over the zones finds all zones to
> be above their high watermark and only does goto out (all_zones_ok).
>
> 2. pgdat_balanced() at the out: label also just checks the high
> watermark, so the node is considered balanced and the order is not
> reduced.
>
> 3. In the `if (order)' block after it, compaction_suitable() checks if
> the zone's low watermark + twice the huge page size is okay, which
> is not necessarily the case in a small zone, and so COMPACT_SKIPPED
> makes it go back to loop_again:.
>
> This will go on until somebody else allocates and breaches the high
> watermark and then hopefully goes on to reclaim the zone above low
> watermark + 2 * THP.
>
> I'm not really sure what the correct solution is. Should we modify
> the zone_watermark_ok() checks in balance_pgdat() to take into account
> the higher watermark requirements for reclaim on behalf of compaction?
> Change the check in compaction_suitable() / not use it directly?
>
Hi Johannes,
- If all zones meet their high watermark and balance_pgdat() does goto
out, why does it still go into the `if (order)' block?
- If we depend on compaction to get enough contiguous pages, why can't
    if (COMPACTION_BUILD && order &&
            compaction_suitable(zone, order) !=
                COMPACT_SKIPPED)
        testorder = 0;
guarantee that low watermark + twice the huge page size is okay?
Regards,
Jaegeuk
>
> Thanks,
> Johannes
* Re: kswapd endless loop for compaction
From: Jaegeuk Hanse @ 2012-11-23 8:50 UTC (permalink / raw)
To: Johannes Weiner; +Cc: Rik van Riel, Mel Gorman, linux-mm, linux-kernel
On 11/21/2012 03:04 AM, Johannes Weiner wrote:
> Hi guys,
>
> while testing a 3.7-rc5ish kernel, I noticed that kswapd can drop into
> a busy spin state without doing reclaim. printk-style debugging told
> me that this happens when the distance between a zone's high watermark
> and its low watermark is less than two huge pages (DMA zone).
>
> 1. The first loop in balance_pgdat() over the zones finds all zones to
> be above their high watermark and only does goto out (all_zones_ok).
>
> 2. pgdat_balanced() at the out: label also just checks the high
> watermark, so the node is considered balanced and the order is not
> reduced.
>
> 3. In the `if (order)' block after it, compaction_suitable() checks if
> the zone's low watermark + twice the huge page size is okay, which
> is not necessarily the case in a small zone, and so COMPACT_SKIPPED
> makes it go back to loop_again:.
>
> This will go on until somebody else allocates and breaches the high
> watermark and then hopefully goes on to reclaim the zone above low
> watermark + 2 * THP.
>
> I'm not really sure what the correct solution is. Should we modify
> the zone_watermark_ok() checks in balance_pgdat() to take into account
> the higher watermark requirements for reclaim on behalf of compaction?
> Change the check in compaction_suitable() / not use it directly?
Hi Johannes,
If we depend on compaction to get enough contiguous pages, why can't
    if (COMPACTION_BUILD && order &&
            compaction_suitable(zone, order) !=
                COMPACT_SKIPPED)
        testorder = 0;
guarantee that low watermark + twice the huge page size is okay?
Regards,
Jaegeuk
> Thanks,
> Johannes