From mboxrd@z Thu Jan 1 00:00:00 1970
From: KOSAKI Motohiro
To: Minchan Kim
Subject: Re: Deadlock possibly caused by too_many_isolated.
Cc: kosaki.motohiro@jp.fujitsu.com, Andrew Morton, Neil Brown, Wu Fengguang, Rik van Riel, KAMEZAWA Hiroyuki, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "Li, Shaohua"
References: <20101019095144.A1B0.A69D9226@jp.fujitsu.com>
Message-Id: <20101019102114.A1B9.A69D9226@jp.fujitsu.com>
Date: Tue, 19 Oct 2010 10:21:32 +0900 (JST)
X-Mailing-List: linux-kernel@vger.kernel.org

> On Tue, Oct 19, 2010 at 9:57 AM, KOSAKI Motohiro
> wrote:
> >> > I think there are two bugs here.
> >> > The raid1 bug that Torsten mentions is certainly real (and has been around
> >> > for an embarrassingly long time).
> >> > The bug that I identified in too_many_isolated is also a real bug and can be
> >> > triggered without md/raid1 in the mix.
> >> > So this is not a 'full fix' for every bug in the kernel :-), but it could
> >> > well be a full fix for this particular bug.
> >> >
> >>
> >> Can we just delete the too_many_isolated() logic? (Crappy comment
> >> describes what the code does but not why it does it).
> >
> > If I remember correctly, we got a bug report about 1-2 years ago that LTP
> > could trigger mysterious OOM killer invocations. The cause: if too many
> > processes are in the reclaim path, all of the reclaimable pages can be
> > isolated, so the last reclaimer finds that the system has no reclaimable
> > pages and invokes the OOM killer.
> > We have a strong motivation to avoid false-positive OOMs, and some
> > discussion then led to this patch.
> >
> > If I remember incorrectly, I hope Wu or Rik will correct me.
>
> AFAIR, it's right.
>
> How about this?
>
> It throttles more aggressively than the old code (i.e., it works at zone
> granularity rather than per LRU type).
> But I think it can prevent the unnecessary OOM problem and solve the
> deadlock problem.

Can you please elaborate on your intention? Do you think Wu's approach is wrong?
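
For readers following the thread, the rationale quoted above can be illustrated with a minimal userspace sketch in C. It is not the kernel's code: the struct, its field names, and the helper `example_usage()` are hypothetical, and the in-kernel too_many_isolated() has additional conditions that are left out here. The sketch shows only the core idea as described in the quoted text, that a direct reclaimer should wait once more pages of one LRU type are isolated than remain on that inactive list, instead of concluding that nothing is reclaimable.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical per-zone counters. The field names are illustrative only;
 * they stand in for per-zone inactive and isolated page statistics.
 */
struct zone_counts {
    unsigned long inactive_file;
    unsigned long isolated_file;
    unsigned long inactive_anon;
    unsigned long isolated_anon;
};

/*
 * Sketch of the throttle check at LRU-type granularity: report "too many
 * isolated" when the pages already isolated by other reclaimers outnumber
 * the pages still sitting on the matching inactive list.
 */
bool too_many_isolated(const struct zone_counts *z, bool file)
{
    unsigned long inactive = file ? z->inactive_file : z->inactive_anon;
    unsigned long isolated = file ? z->isolated_file : z->isolated_anon;

    return isolated > inactive;
}

/* Usage example with made-up numbers. */
void example_usage(void)
{
    struct zone_counts z = {
        .inactive_file = 10, .isolated_file = 20,
        .inactive_anon = 100, .isolated_anon = 5,
    };

    /* Most file pages are held by other reclaimers: the caller should wait. */
    assert(too_many_isolated(&z, true));
    /* Plenty of anon pages remain on the list: no throttling. */
    assert(!too_many_isolated(&z, false));
}
```

In this model a caller that sees `true` would sleep and retry rather than report that no pages are reclaimable, which is the false-positive OOM the quoted explanation wants to avoid. The deadlock discussed in this thread concerns what happens when such a waiter is itself needed for the isolated pages to be released.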