linux-kernel.vger.kernel.org archive mirror
From: Shaohua Li <shaohua.li@intel.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Simon Kirby <sim@hostway.ca>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>
Subject: Re: Free memory never fully used, swapping
Date: Fri, 26 Nov 2010 10:00:44 +0800
Message-ID: <1290736844.12777.10.camel@sli10-conroe>
In-Reply-To: <20101125161524.GE26037@csn.ul.ie>

On Fri, 2010-11-26 at 00:15 +0800, Mel Gorman wrote:
> On Thu, Nov 25, 2010 at 07:51:44PM +0900, KOSAKI Motohiro wrote:
> > > kswapd is throwing out many times what is needed for the order 3
> > > watermark to be met.  It seems to be not as bad now, but look at these
> > > pages being reclaimed (200ms intervals, whitespace-packed buddyinfo
> > > followed by nr_pages_free calculation and final order-3 watermark test,
> > > kswapd woken after the second sample):
> > > 
> > > Normal zone at the same time (shown separately for clarity):
> > >
> > >   Zone order:0      1     2     3    4   5  6 7 8 9 A nr_free or3-low-chk
> > > 
> > > Normal     452      1     0     0    0   0  0 0 0 0 0     454 -5 <= 238
> > > Normal     452      1     0     0    0   0  0 0 0 0 0     454 -5 <= 238
> > > (kswapd wakes)
> > > Normal    7618     76     0     0    0   0  0 0 0 0 0    7770 145 <= 238
> > > Normal    8860     73     1     0    0   0  0 0 0 0 0    9010 143 <= 238
> > > Normal    8929     25     0     0    0   0  0 0 0 0 0    8979 43 <= 238
> > > Normal    8917      0     0     0    0   0  0 0 0 0 0    8917 -7 <= 238
> > > Normal    8978     16     0     0    0   0  0 0 0 0 0    9010 25 <= 238
> > > Normal    9064      4     0     0    0   0  0 0 0 0 0    9072 1 <= 238
> > > Normal    9068      2     0     0    0   0  0 0 0 0 0    9072 -3 <= 238
> > > Normal    8992      9     0     0    0   0  0 0 0 0 0    9010 11 <= 238
> > > Normal    9060      6     0     0    0   0  0 0 0 0 0    9072 5 <= 238
> > > Normal    9010      0     0     0    0   0  0 0 0 0 0    9010 -7 <= 238
> > > Normal    8907      5     0     0    0   0  0 0 0 0 0    8917 3 <= 238
> > > Normal    8576      0     0     0    0   0  0 0 0 0 0    8576 -7 <= 238
> > > Normal    8018      0     0     0    0   0  0 0 0 0 0    8018 -7 <= 238
> > > Normal    6778      0     0     0    0   0  0 0 0 0 0    6778 -7 <= 238
> > > Normal    6189      0     0     0    0   0  0 0 0 0 0    6189 -7 <= 238
> > > Normal    6220      0     0     0    0   0  0 0 0 0 0    6220 -7 <= 238
> > > Normal    6096      0     0     0    0   0  0 0 0 0 0    6096 -7 <= 238
> > > Normal    6251      0     0     0    0   0  0 0 0 0 0    6251 -7 <= 238
> > > Normal    6127      0     0     0    0   0  0 0 0 0 0    6127 -7 <= 238
> > > Normal    6218      1     0     0    0   0  0 0 0 0 0    6220 -5 <= 238
> > > Normal    6034      0     0     0    0   0  0 0 0 0 0    6034 -7 <= 238
> > > Normal    6065      0     0     0    0   0  0 0 0 0 0    6065 -7 <= 238
> > > Normal    6189      0     0     0    0   0  0 0 0 0 0    6189 -7 <= 238
> > > Normal    6189      0     0     0    0   0  0 0 0 0 0    6189 -7 <= 238
> > > Normal    6096      0     0     0    0   0  0 0 0 0 0    6096 -7 <= 238
> > > Normal    6127      0     0     0    0   0  0 0 0 0 0    6127 -7 <= 238
> > > Normal    6158      0     0     0    0   0  0 0 0 0 0    6158 -7 <= 238
> > > Normal    6127      0     0     0    0   0  0 0 0 0 0    6127 -7 <= 238
> > > (kswapd sleeps -- maybe too much turkey)
> > > 
> > > DMA32 get so much reclaimed that the watermark test succeeded long ago.
> > > Meanwhile, Normal is being reclaimed as well, but because it's fighting
> > > with allocations, it tries for a while and eventually succeeds (I think),
> > > but the 200ms samples didn't catch it.
> > > 
> > > KOSAKI Motohiro, I'm interested in your commit 73ce02e9.  This seems
> > > to be similar to this problem, but your change is not working here. 
> > > We're seeing kswapd run without sleeping, KSWAPD_SKIP_CONGESTION_WAIT
> > > is increasing (so has_under_min_watermark_zone is true), and pageoutrun
> > > increasing all the time.  This means that balance_pgdat() keeps being
> > > called, but sleeping_prematurely() is returning true, so kswapd() just
> > > keeps re-calling balance_pgdat().  If your approach is correct to stop
> > > kswapd here, the problem seems to be that balance_pgdat's copy of order
> > > and sc.order is being set to 0, but not pgdat->kswapd_max_order, so
> > > kswapd never really sleeps.  How is this supposed to work?
> > 
> > Um. this seems regression since commit f50de2d381 (vmscan: have kswapd sleep 
> > for a short interval and double check it should be asleep)
> > 
> 
> I wrote my own patch before I saw this but for one of the issues we are doing
> something similar. You are checking if enough pages got reclaimed where as
> my patch considers any zone being balanced for high-orders being sufficient
> for kswapd to go to sleep. I think mine is a little stronger because
> it's checking what state the zones are in for a given order regardless
> of what has been reclaimed. Lets see what testing has to say.

Recording the order does not seem sufficient. In balance_pgdat(), the for
loop exits only when:
	priority < 0 or sc.nr_reclaimed >= SWAP_CLUSTER_MAX.
But afterwards we do:
	if (sc.nr_reclaimed < SWAP_CLUSTER_MAX)
		order = sc.order = 0;
This means that by the time we set order to 0, we have already reclaimed a
lot of pages. So I think we need to set order to 0 earlier, once there are
already enough free pages. Below is a debug patch.


diff --git a/mm/vmscan.c b/mm/vmscan.c
index d31d7ce..ee5d2ed 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2117,6 +2117,26 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
 }
 #endif
 
+static int all_zone_enough_free_pages(pg_data_t *pgdat)
+{
+	int i;
+
+	for (i = 0; i < pgdat->nr_zones; i++) {
+		struct zone *zone = pgdat->node_zones + i;
+
+		if (!populated_zone(zone))
+			continue;
+
+		if (zone->all_unreclaimable)
+			continue;
+
+		if (!zone_watermark_ok(zone, 0, high_wmark_pages(zone) * 8,
+								0, 0))
+			return 0;
+	}
+	return 1;
+}
+
 /* is kswapd sleeping prematurely? */
 static int sleeping_prematurely(pg_data_t *pgdat, int order, long remaining)
 {
@@ -2355,7 +2375,8 @@ out:
 		 * back to sleep. High-order users can still perform direct
 		 * reclaim if they wish.
 		 */
-		if (sc.nr_reclaimed < SWAP_CLUSTER_MAX)
+		if (sc.nr_reclaimed < SWAP_CLUSTER_MAX ||
+		    (order > 0 && all_zone_enough_free_pages(pgdat)))
 			order = sc.order = 0;
 
 		goto loop_again;
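
To make the intent of the check easier to follow, here is a rough userspace
model of the watermark test and of the helper added above. It is only a
sketch: struct zone_model and the model_* names are made up for illustration,
lowmem_reserve and alloc_flags are ignored, the populated_zone() test is
dropped, the low watermark of 476 is inferred from the "<= 238" column in
Simon's table (one halving), and the high watermark of 571 is a hypothetical
value, not taken from his machine.

```c
#include <stdbool.h>

#define MAX_ORDER 11	/* orders 0..A, as in the quoted buddyinfo table */

/* Userspace stand-in for struct zone; NOT the kernel structure. */
struct zone_model {
	unsigned long nr_free[MAX_ORDER];	/* free blocks per order */
	long high_wmark;			/* hypothetical high watermark */
	bool all_unreclaimable;
};

/* Total free pages: a free block of order o holds 1 << o pages. */
static long model_nr_free_pages(const struct zone_model *z)
{
	long total = 0;

	for (int o = 0; o < MAX_ORDER; o++)
		total += (long)(z->nr_free[o] << o);
	return total;
}

/* Simplified watermark test: no lowmem_reserve, no alloc_flags. */
static bool model_watermark_ok(const struct zone_model *z, int order, long mark)
{
	long min = mark;
	long free_pages = model_nr_free_pages(z) - (1L << order) + 1;

	if (free_pages <= min)
		return false;
	for (int o = 0; o < order; o++) {
		/* Blocks below the requested order cannot satisfy it. */
		free_pages -= (long)(z->nr_free[o] << o);
		/* Require fewer free pages at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}

/* Models all_zone_enough_free_pages(): order-0 test against 8 * high. */
static bool model_all_zones_enough_free(const struct zone_model *zones, int nr)
{
	for (int i = 0; i < nr; i++) {
		if (zones[i].all_unreclaimable)
			continue;
		if (!model_watermark_ok(&zones[i], 0, zones[i].high_wmark * 8))
			return false;
	}
	return true;
}
```

With a zone holding 8917 order-0 pages and nothing above (one of the Normal
samples), the order-3 test in this model fails at the first step (-7 <= 238),
while the order-0 test against 8 * 571 passes. That is the state in which the
patch drops order to 0 instead of continuing to reclaim.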



