From: KOSAKI Motohiro
To: Andrew Morton
Cc: kosaki.motohiro@jp.fujitsu.com, Neil Brown, Wu Fengguang, Rik van Riel, KAMEZAWA Hiroyuki, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "Li, Shaohua"
Subject: Re: Deadlock possibly caused by too_many_isolated.
Date: Tue, 19 Oct 2010 09:57:27 +0900 (JST)
Message-Id: <20101019095144.A1B0.A69D9226@jp.fujitsu.com>
In-Reply-To: <20101018154137.90f5325f.akpm@linux-foundation.org>
References: <20101019093142.509d6947@notabene> <20101018154137.90f5325f.akpm@linux-foundation.org>

> > I think there are two bugs here.
> > The raid1 bug that Torsten mentions is certainly real (and has been around
> > for an embarrassingly long time).
> > The bug that I identified in too_many_isolated is also a real bug and can be
> > triggered without md/raid1 in the mix.
> > So this is not a 'full fix' for every bug in the kernel :-), but it could
> > well be a full fix for this particular bug.
> >
>
> Can we just delete the too_many_isolated() logic?  (Crappy comment
> describes what the code does but not why it does it).

If I remember correctly, about 1-2 years ago we got a bug report that LTP
could trigger mysterious OOM killer invocations.
That happened because, when too many processes are in the reclaim path, all
of the reclaimable pages can be isolated. The last reclaimer then finds that
the system has no reclaimable pages left and ends up invoking the OOM killer.
We have a strong motivation to avoid false-positive OOM kills, so after some
discussion this patch was made.

If I remember incorrectly, I hope Wu or Rik will correct me.
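To make the scenario above concrete, here is a small user-space model of the throttling idea. It is only a sketch, not the kernel code: `struct zone_counts`, `too_many_isolated_model()` and `try_isolate()` are hypothetical names, and the rule that a reclaimer backs off once isolated pages outnumber the remaining inactive pages is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-zone LRU counters. */
struct zone_counts {
    unsigned long inactive;  /* pages still on the inactive LRU list */
    unsigned long isolated;  /* pages taken off the LRU by reclaimers */
};

/* Assumed rule: throttle when isolated pages outnumber inactive ones. */
static bool too_many_isolated_model(const struct zone_counts *z)
{
    return z->isolated > z->inactive;
}

/*
 * One reclaimer tries to isolate a batch of pages.
 * Returns the number of pages it took, or 0 if it was throttled
 * (a real reclaimer would wait and retry instead of scanning).
 */
static unsigned long try_isolate(struct zone_counts *z, unsigned long batch)
{
    if (too_many_isolated_model(z))
        return 0;
    if (batch > z->inactive)
        batch = z->inactive;  /* cannot take more than is on the list */
    z->inactive -= batch;
    z->isolated += batch;
    return batch;
}
```

Starting from 96 inactive pages, two reclaimers that each take 32 pages succeed. A third is throttled because the 64 isolated pages exceed the 32 still on the list, so a late reclaimer never sees an empty list and has no reason to conclude the system is out of reclaimable memory.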