From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753599Ab1JLTUX (ORCPT ); Wed, 12 Oct 2011 15:20:23 -0400
Received: from mail-pz0-f42.google.com ([209.85.210.42]:43193 "EHLO
	mail-pz0-f42.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752177Ab1JLTUV (ORCPT ); Wed, 12 Oct 2011 15:20:21 -0400
Date: Wed, 12 Oct 2011 12:20:18 -0700
From: Andrew Morton
To: Rik van Riel
Cc: Satoru Moriya, David Rientjes, Randy Dunlap, Satoru Moriya,
	"linux-kernel@vger.kernel.org", "linux-mm@kvack.org",
	"lwoodman@redhat.com", Seiji Aguchi, "hughd@google.com",
	"hannes@cmpxchg.org"
Subject: Re: [PATCH -v2 -mm] add extra free kbytes tunable
Message-Id: <20111012122018.690bdf28.akpm@linux-foundation.org>
In-Reply-To: <4E95917D.3080507@redhat.com>
References: <20110901105208.3849a8ff@annuminas.surriel.com>
	<20110901100650.6d884589.rdunlap@xenotime.net>
	<20110901152650.7a63cb8b@annuminas.surriel.com>
	<20111010153723.6397924f.akpm@linux-foundation.org>
	<65795E11DBF1E645A09CEC7EAEE94B9CB516CBC4@USINDEVS02.corp.hds.com>
	<20111011125419.2702b5dc.akpm@linux-foundation.org>
	<65795E11DBF1E645A09CEC7EAEE94B9CB516CBFE@USINDEVS02.corp.hds.com>
	<20111011135445.f580749b.akpm@linux-foundation.org>
	<4E95917D.3080507@redhat.com>
X-Mailer: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 12 Oct 2011 09:09:17 -0400
Rik van Riel wrote:

> On 10/11/2011 04:54 PM, Andrew Morton wrote:
> > On Tue, 11 Oct 2011 16:23:22 -0400
> > Satoru Moriya wrote:
> >
> >> On 10/11/2011 03:55 PM, Andrew Morton wrote:
> >>> On Tue, 11 Oct 2011 15:32:11 -0400
> >>> Satoru Moriya wrote:
> >>>
> >>>> On 10/10/2011 06:37 PM, Andrew Morton wrote:
> >>>>> On Fri, 7 Oct 2011 20:08:19 -0700 (PDT) David Rientjes
> >>>>> wrote:
> >>>>>
> >>>>>> On Thu, 1 Sep 2011, Rik van Riel wrote:
> >>>>
> >>>> Actually page allocator decreases min watermark to 3/4 * min
> >>>> watermark for rt-task. But in our case some applications create a lot
> >>>> of processes and if all of them are rt-task, the amount of watermark
> >>>> bonus(1/4 * min watermark) is not enough.
> >>>>
> >>>> If we can tune the amount of bonus, it may be fine. But that is
> >>>> almost all same as extra free kbytes.
> >>>
> >>> This situation is detectable at runtime. If realtime tasks are being
> >>> stalled in the page allocator then start to increase the free-page
> >>> reserves. A little control system.
> >>
> >> Detecting at runtime is too late for some latency critical systems.
> >> At that system, we must avoid a stall before it happens.
> >
> > It's pretty darn obvious that the kernel can easily see the situation
> > developing before it happens. By comparing a few integers.
>
> The problem is that we may be dealing with bursts, not steady
> states of allocations. Without knowing the size of a burst,
> we have no idea when we should wake up kswapd to get enough
> memory freed ahead of the application's allocations.

That problem remains with this patch - it just takes a larger burst.

Unless the admin somehow manages to configure the tunable large enough
to cover the largest burst, and there aren't other applications
allocating memory during that burst, and the time between bursts is
sufficient for kswapd to be able to sufficiently replenish free-page
reserves. All of which sounds rather unlikely.

> > Look, please don't go bending over backwards like this to defend a bad
> > patch. It's a bad patch! It would be better not to have to merge it.
> > Let's do something better.
>
> I would love it if we could come up with something better,
> and have thought about it a lot.
>
> However, so far we do not seem to have an alternative yet :(

Do we actually have a real-world application which is hurting from
this?
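
For readers following the argument, the arithmetic behind it can be
sketched as a toy model. This is not kernel code and not part of the
patch under discussion: the function names, the tick-based refill, and
all the numbers are assumptions made up for illustration. It only shows
the two points made in the thread: an rt-task sees 3/4 of the min
watermark, and a fixed reserve avoids a stall only while each burst
fits in it and the gap between bursts lets background reclaim catch up.

```python
def rt_min_watermark(min_wmark_pages: int) -> int:
    """Effective min watermark for an rt-task, per the thread: the
    allocator drops it by a quarter, leaving 3/4 of the original."""
    return min_wmark_pages - min_wmark_pages // 4


def simulate_bursts(reserve_pages, bursts, gap_ticks, refill_per_tick):
    """Return the indices of bursts that exhaust the reserve (a stall).

    Hypothetical model parameters:
    reserve_pages   -- free pages held in reserve before the first burst
    bursts          -- pages allocated by each burst, in order
    gap_ticks       -- idle ticks between consecutive bursts
    refill_per_tick -- pages background reclaim frees per idle tick
    """
    free = reserve_pages
    stalled = []
    for i, burst in enumerate(bursts):
        if burst > free:
            # The burst does not fit: the allocating task would stall.
            stalled.append(i)
            free = 0
        else:
            free -= burst
        # Background reclaim refills the reserve, never past its target.
        free = min(reserve_pages, free + gap_ticks * refill_per_tick)
    return stalled


if __name__ == "__main__":
    print(rt_min_watermark(1000))
    # Small reserve, short gap: the second burst stalls.
    print(simulate_bursts(100, [80, 80], gap_ticks=1, refill_per_tick=10))
    # Larger reserve: the same bursts fit, but a larger burst still stalls.
    print(simulate_bursts(1000, [80, 80], gap_ticks=1, refill_per_tick=10))
    print(simulate_bursts(1000, [1200], gap_ticks=1, refill_per_tick=10))
```

In this model, raising reserve_pages only moves the failure point to a
larger burst or a shorter gap, which is the objection raised above.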