From: Neil Brown <neilb@suse.de>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Rik van Riel <riel@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: Deadlock possibly caused by too_many_isolated.
Date: Wed, 15 Sep 2010 13:17:35 +1000
Message-ID: <20100915131735.08899288@notabene>
In-Reply-To: <20100915030640.GA11141@localhost>

On Wed, 15 Sep 2010 11:06:40 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Wed, Sep 15, 2010 at 10:54:54AM +0800, Wu Fengguang wrote:
> > On Wed, Sep 15, 2010 at 10:37:35AM +0800, Wu Fengguang wrote:
> > > On Wed, Sep 15, 2010 at 10:23:34AM +0800, Neil Brown wrote:
> > > > On Tue, 14 Sep 2010 20:30:18 -0400
> > > > Rik van Riel <riel@redhat.com> wrote:
> > > > 
> > > > > On 09/14/2010 07:11 PM, Neil Brown wrote:
> > > > > 
> > > > > > Index: linux-2.6.32-SLE11-SP1/mm/vmscan.c
> > > > > > ===================================================================
> > > > > > --- linux-2.6.32-SLE11-SP1.orig/mm/vmscan.c	2010-09-15 08:37:32.000000000 +1000
> > > > > > +++ linux-2.6.32-SLE11-SP1/mm/vmscan.c	2010-09-15 08:38:57.000000000 +1000
> > > > > > @@ -1106,6 +1106,11 @@ static unsigned long shrink_inactive_lis
> > > > > >   		/* We are about to die and free our memory. Return now. */
> > > > > >   		if (fatal_signal_pending(current))
> > > > > >   			return SWAP_CLUSTER_MAX;
> > > > > > > +		if (!(sc->gfp_mask & __GFP_IO))
> > > > > > +			/* Not allowed to do IO, so mustn't wait
> > > > > > +			 * on processes that might try to
> > > > > > +			 */
> > > > > > +			return SWAP_CLUSTER_MAX;
> > > > > >   	}
> > > > > >
> > > > > >   	/*
> > > > > 
> > > > > Close.  We must also be sure that processes without __GFP_FS
> > > > > set in their gfp_mask do not wait on processes that do have
> > > > > __GFP_FS set.
> > > > > 
> > > > > Considering how many times we've run into a bug like this,
> > > > > I'm kicking myself for not having thought of it :(
> > > > > 
> > > > 
> > > > So maybe this?  I've added the test for __GFP_FS, and moved the test before
> > > > the congestion_wait on the basis that we really want to get back up the stack
> > > > and try the mempool ASAP.
> > > 
> > > The patch may well fail the !__GFP_IO page allocation and then
> > > quickly exhaust the mempool.
> > > 
> > > Another approach may be to let too_many_isolated() use much higher
> > > thresholds for !__GFP_IO/FS and lower ones for __GFP_IO/FS, i.e. to
> > > allow at least nr2 NOIO/FS tasks to be blocked independent of the
> > > IO/FS ones.  Since NOIO vmscans typically complete fast, it will then
> > > be very hard to accumulate enough NOIO processes to be actually blocked.
> > > 
> > > 
> > >                   IO/FS tasks                NOIO/FS tasks           full
> > >                   block here                 block here              LRU size
> > > |-----------------|--------------------------|-----------------------|
> > > |      nr1        |           nr2            |
> > 
> > How about this fix? We may need a very high threshold for NOIO/NOFS to
> > prevent possible regressions.
> 
> Plus __GFP_WAIT..
> 
> ---
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 225a759..6a896eb 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1135,6 +1135,7 @@ static int too_many_isolated(struct zone *zone, int file,
>  		struct scan_control *sc)
>  {
>  	unsigned long inactive, isolated;
> +	int ratio;
>  
>  	if (current_is_kswapd())
>  		return 0;
> @@ -1150,7 +1151,15 @@ static int too_many_isolated(struct zone *zone, int file,
>  		isolated = zone_page_state(zone, NR_ISOLATED_ANON);
>  	}
>  
> -	return isolated > inactive;
> +	ratio = 1;
> +	if (!(sc->gfp_mask & (__GFP_FS)))
> +		ratio <<= 1;
> +	if (!(sc->gfp_mask & (__GFP_IO)))
> +		ratio <<= 1;
> +	if (!(sc->gfp_mask & (__GFP_WAIT)))
> +		ratio <<= 1;
> +
> +	return isolated > inactive * ratio;
>  }
>  
>  /*


Are you suggesting this instead of my patch, or as well as my patch?

Because while I think it sounds like a good idea, I don't think it actually
removes the chance of a deadlock; it just makes it a lot less likely.
So I think your patch combined with my patch would be a good total solution.

Do you agree?

Thanks,
NeilBrown
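
For readers following the thread, the threshold scaling in the quoted
too_many_isolated() patch can be modelled in userspace. This is only a
sketch: the MODEL_GFP_* values and both function names are hypothetical
stand-ins, not the kernel's definitions, and the zone counters are passed in
as plain numbers instead of being read with zone_page_state().

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's real values. */
#define MODEL_GFP_WAIT 0x1u
#define MODEL_GFP_IO   0x2u
#define MODEL_GFP_FS   0x4u

/*
 * Mirrors the ratio logic of the quoted patch: each restriction the
 * caller is under (no FS, no IO, no waiting) doubles the threshold.
 */
static unsigned long isolation_ratio(unsigned int gfp_mask)
{
	unsigned long ratio = 1;

	if (!(gfp_mask & MODEL_GFP_FS))
		ratio <<= 1;
	if (!(gfp_mask & MODEL_GFP_IO))
		ratio <<= 1;
	if (!(gfp_mask & MODEL_GFP_WAIT))
		ratio <<= 1;

	return ratio;
}

/*
 * Model of the final comparison. In the kernel the two counters come
 * from per-zone statistics; here they are plain arguments.
 */
static bool too_many_isolated_model(unsigned long inactive,
				    unsigned long isolated,
				    unsigned int gfp_mask)
{
	return isolated > inactive * isolation_ratio(gfp_mask);
}
```

In this model a caller with none of the three restrictions gets ratio 1, and
a caller with only MODEL_GFP_WAIT set (no IO, no FS) gets ratio 4. With 100
inactive pages, 150 isolated pages throttle the first caller but not the
second, while 401 isolated pages throttle both. That matches the concern
above: a higher threshold makes it harder for a NOIO caller to block, but
does not make it impossible.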


