Date: Tue, 19 Oct 2010 11:52:47 +0900
Subject: Re: Deadlock possibly caused by too_many_isolated.
In-Reply-To: <20101019023537.GB8310@localhost>
References: <20101019093142.509d6947@notabene>
 <20101018154137.90f5325f.akpm@linux-foundation.org>
 <20101019095144.A1B0.A69D9226@jp.fujitsu.com>
 <20101019023537.GB8310@localhost>
From: Minchan Kim
To: Wu Fengguang
Cc: KOSAKI Motohiro, Andrew Morton, Neil Brown, Rik van Riel,
 KAMEZAWA Hiroyuki, "linux-kernel@vger.kernel.org",
 "linux-mm@kvack.org", "Li, Shaohua"

Hi Wu,

On Tue, Oct 19, 2010 at 11:35 AM, Wu Fengguang wrote:
>> @@ -2054,10 +2069,11 @@ rebalance:
>>                 goto got_pg;
>>
>>         /*
>> -        * If we failed to make any progress reclaiming, then we are
>> -        * running out of options and have to consider going OOM
>> +        * If we failed to make any progress reclaiming and there aren't
>> +        * many parallel reclaiming, then we are running out of options and
>> +        * have to consider going OOM
>>          */
>> -       if (!did_some_progress) {
>> +       if (!did_some_progress && !too_many_isolated_zone(preferred_zone)) {
>>                 if ((gfp_mask & __GFP_FS) && !(gfp_mask & __GFP_NORETRY)) {
>>                         if (oom_killer_disabled)
>>                                 goto nopage;
>
> This is simply wrong.
>
> It disabled this block for 99% system because there won't be enough
> tasks to make (!too_many_isolated_zone == true). As a result the LRU
> will be scanned like mad and no task get OOMed when it should be.

If !too_many_isolated_zone() is false, it means there are already many
tasks in direct reclaim. Those tasks can exit the reclaim path, and then
!too_many_isolated_zone() will be true. What am I missing?

> Thanks,
> Fengguang
>

-- 
Kind regards,
Minchan Kim
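
The gate discussed in the message above can be tried out in user space. What follows is only a minimal sketch in C, not kernel code. The struct zone_model type, its two counters, the reclaimers_exit() helper, and the "isolated > inactive" threshold inside too_many_isolated_zone() are assumptions made for illustration, since the thread does not show that function's body. Only the shape of the condition, !did_some_progress && !too_many_isolated_zone(zone), is taken from the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-zone counters; not the kernel's struct zone. */
struct zone_model {
	unsigned long nr_inactive;	/* pages still on the inactive LRU */
	unsigned long nr_isolated;	/* pages isolated by tasks in reclaim */
};

/*
 * Assumed threshold: the zone counts as crowded with reclaimers when more
 * pages are isolated than remain on the inactive LRU.
 */
static bool too_many_isolated_zone(const struct zone_model *z)
{
	return z->nr_isolated > z->nr_inactive;
}

/*
 * The gate from the quoted hunk: consider OOM only when reclaim made no
 * progress and the zone is not crowded with parallel reclaimers.
 */
static bool should_consider_oom(bool did_some_progress,
				const struct zone_model *z)
{
	return !did_some_progress && !too_many_isolated_zone(z);
}

/*
 * Hypothetical helper: tasks leaving the reclaim path put their isolated
 * pages back, so the isolated count drops (clamped at zero).
 */
static void reclaimers_exit(struct zone_model *z, unsigned long pages)
{
	z->nr_isolated = (pages > z->nr_isolated) ? 0 : z->nr_isolated - pages;
}
```

With nr_inactive = 32 and nr_isolated = 64, should_consider_oom(false, &z) is false, because the zone is crowded. After reclaimers_exit(&z, 48) the isolated count is 16 and the same call returns true, which is the sequence the reply describes. Whether enough tasks ever make the zone crowded on a typical system, which is the objection quoted above, depends on the real threshold and is not something this model settles.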