From: Mel Gorman <mgorman@suse.de>
To: Minchan Kim <minchan.kim@gmail.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	akpm@linux-foundation.org, colin.king@canonical.com,
	raghu.prabhu13@gmail.com, jack@suse.cz, chris.mason@oracle.com,
	cl@linux.com, penberg@kernel.org, riel@redhat.com,
	hannes@cmpxchg.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-ext4@vger.kernel.org
Subject: Re: [PATCH 4/4] mm: vmscan: If kswapd has been running too long, allow it to sleep
Date: Mon, 16 May 2011 09:45:58 +0100
Message-ID: <20110516084558.GE5279@suse.de>
In-Reply-To: <BANLkTi=oe4Ties6awwhHFPf42EXCn2U4MQ@mail.gmail.com>

On Mon, May 16, 2011 at 02:04:00PM +0900, Minchan Kim wrote:
> On Mon, May 16, 2011 at 1:21 PM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Sun, 2011-05-15 at 19:27 +0900, KOSAKI Motohiro wrote:
> >> (2011/05/13 23:03), Mel Gorman wrote:
> >> > Under constant allocation pressure, kswapd can be in the situation where
> >> > sleeping_prematurely() will always return true even if kswapd has been
> >> > running a long time. Check if kswapd needs to be scheduled.
> >> >
> >> > Signed-off-by: Mel Gorman<mgorman@suse.de>
> >> > ---
> >> >   mm/vmscan.c |    4 ++++
> >> >   1 files changed, 4 insertions(+), 0 deletions(-)
> >> >
> >> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> >> > index af24d1e..4d24828 100644
> >> > --- a/mm/vmscan.c
> >> > +++ b/mm/vmscan.c
> >> > @@ -2251,6 +2251,10 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
> >> >     unsigned long balanced = 0;
> >> >     bool all_zones_ok = true;
> >> >
> >> > +   /* If kswapd has been running too long, just sleep */
> >> > +   if (need_resched())
> >> > +           return false;
> >> > +
> >>
> >> Hmm... I don't like this patch so much. because this code does
> >>
> >> - don't sleep if kswapd got context switch at shrink_inactive_list
> >
> > This isn't entirely true:  need_resched() will be false, so we'll follow
> > the normal path for determining whether to sleep or not, in effect
> > leaving the current behaviour unchanged.
> >
> >> - sleep if kswapd didn't
> >
> > This also isn't entirely true: whether need_resched() is true at this
> > point depends on a whole lot more than whether we did a context switch
> > in shrink_inactive. It mostly depends on how long we've been running
> > without giving up the CPU.  Generally that will mean we've been round
> > the shrinker loop hundreds to thousands of times without sleeping.
> >
> >> It seems to be semi random behavior.
> >
> > Well, we have to do something.  Chris Mason first suspected the hang was
> > a kswapd rescheduling problem a while ago.  We tried putting
> > cond_rescheds() in several places in the vmscan code, but to no avail.
> 
> Is that the result of a test with Hannes' patch (i.e., !pgdat_balanced)?
> 
> If it isn't, putting cond_resched() in vmscan.c would be a no-op.
> Although zone balancing completes, kswapd doesn't sleep because
> pgdat_balanced returns the wrong result, so the VM ends up calling
> balance_pgdat again. In this case, balance_pgdat returns without doing
> any work because kswapd can't find any zone that is short of free pages
> and jumps to out. kswapd could repeat this indefinitely, so it never
> gets a chance to call cond_resched.
> 
> But if your test was with Hannes' patch, I am very curious why
> kswapd consumes so much CPU.
> 
> > The need_resched() in sleeping_prematurely() seems to be about the best
> > option.  The other option might be just to put a cond_resched() in
> > kswapd_try_to_sleep(), but that will really have about the same effect.
> 
> I don't oppose it but before that, I think we have to know why kswapd
> consumes CPU a lot although we applied Hannes' patch.
> 

Because it's still possible for processes to allocate pages at the same
rate kswapd is freeing them, leading to a situation where kswapd does not
consider the zone balanced for prolonged periods of time.
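
The situation above can be illustrated with a minimal userspace sketch.
It is a toy model and not kernel code: struct toy_zone, the toy_*
functions, the per-pass page counts and the resched_after threshold are
all hypothetical names and numbers chosen for illustration. In
particular, the real need_resched() flag is set by the scheduler rather
than by a pass counter; the counter here only stands in for "kswapd has
been running a long time without giving up the CPU".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one zone; not the kernel's struct zone. */
struct toy_zone {
	long free_pages;     /* pages currently free */
	long high_watermark; /* zone counts as balanced at or above this */
};

/*
 * Same sense as sleeping_prematurely() in the patch: true means
 * "sleeping now would be premature, keep reclaiming".
 */
static bool toy_sleeping_prematurely(const struct toy_zone *z,
				     bool need_resched_flag, bool apply_patch)
{
	/* The patched check: if we ran too long, allow the sleep. */
	if (apply_patch && need_resched_flag)
		return false;

	/* Otherwise stay awake for as long as the zone is unbalanced. */
	return z->free_pages < z->high_watermark;
}

/*
 * Run reclaim passes until the toy kswapd is allowed to sleep.
 * Returns the pass on which that happened, or max_passes if it
 * never happened within the limit.
 */
static long toy_kswapd_passes(struct toy_zone z, long alloc_per_pass,
			      long reclaim_per_pass, long resched_after,
			      bool apply_patch, long max_passes)
{
	assert(max_passes > 0 && resched_after > 0);

	for (long pass = 1; pass <= max_passes; pass++) {
		/* Allocators consume pages while kswapd frees them. */
		z.free_pages += reclaim_per_pass - alloc_per_pass;

		/* Assumption: the flag is raised after resched_after passes. */
		bool need_resched_flag = pass >= resched_after;

		if (!toy_sleeping_prematurely(&z, need_resched_flag, apply_patch))
			return pass;
	}
	return max_passes;
}
```

With allocation and reclaim running at the same rate, the unpatched
model never reaches the watermark and spins until max_passes, while the
patched model stops once the stand-in need_resched flag is raised. When
reclaim outpaces allocation, both variants stop as soon as the zone is
balanced, so the extra check does not change the normal path.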

-- 
Mel Gorman
SUSE Labs
