Date: Tue, 18 Aug 2020 17:02:32 +0200
From: Michal Hocko
To: "Paul E. McKenney"
Cc: Uladzislau Rezki, Peter Zijlstra, Thomas Gleixner, LKML, RCU, linux-mm@kvack.org, Andrew Morton, Vlastimil Babka, Matthew Wilcox, "Theodore Y. Ts'o", Joel Fernandes, Sebastian Andrzej Siewior, Oleksiy Avramchenko
Subject: Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag
Message-ID: <20200818150232.GQ28270@dhcp22.suse.cz>
In-Reply-To: <20200818135327.GF23602@paulmck-ThinkPad-P72>

On Tue 18-08-20 06:53:27, Paul E. McKenney wrote:
> On Tue, Aug 18, 2020 at 09:43:44AM +0200, Michal Hocko wrote:
> > On Mon 17-08-20 15:28:03, Paul E. McKenney wrote:
> > > On Mon, Aug 17, 2020 at 10:28:49AM +0200, Michal Hocko wrote:
> > > > On Mon 17-08-20 00:56:55, Uladzislau Rezki wrote:
> [ . . . ]
> > > > > wget ftp://vps418301.ovh.net/incoming/1000000_kmalloc_kfree_rcu_proc_percpu_pagelist_fractio_is_8.png
> > > >
> > > > 1/8 of the memory in pcp lists is quite large and likely not something
> > > > used very often.
> > > >
> > > > Both these numbers just make me think that a dedicated pool of pages
> > > > pre-allocated for RCU specifically might be a better solution. I still
> > > > haven't read through that branch of the email thread, though, so there
> > > > might be some pretty convincing arguments not to do that.
> > >
> > > To avoid the problematic corner cases, we would need way more dedicated
> > > memory than is reasonable, as in well over one hundred pages per CPU.
> > > Sure, we could choose a smaller number, but then we are failing to defend
> > > against flooding, even on systems that have more than enough free memory
> > > to be able to do so. It would be better to live within what is available,
> > > taking the performance/robustness hit only if there isn't enough.
> >
> > Thomas had a good point that it doesn't really make much sense to
> > optimize for flooders, because that just makes them more effective.
>
> The point is not to make the flooders go faster, but rather for the
> system to be robust in the face of flooders. Robust as in harder for
> a flooder to OOM the system.

Do we actually see this as a practical problem? I am confused, because the
initial argument revolved around an optimization, but now you are
suggesting that this is really a system stability measure. And I fail to
see how allowing an easy way to completely deplete the pcp caches solves
any of that. Please do realize that if we allow that, then every user who
relies on the pcp caches will have to take a slow(er) path, and that will
have performance consequences. The pool is a global and scarce resource.
That is why I have suggested using a dedicated preallocated pool instead
of draining the global pcp caches.
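The "dedicated preallocated pool" alternative can be illustrated with a small userspace C sketch. Everything here is hypothetical (the rcu_pool_* names, the pool and buffer sizes are invented for illustration, not taken from any posted patch); it only shows the shape of the idea: fill a private reserve up front from a context that may block, then hand buffers out of that reserve so the shared pcp caches are never drained.

```c
#include <stdlib.h>

/* Illustrative sketch of a dedicated preallocated pool.  A fixed reserve
 * is filled in advance; lock-sensitive consumers take buffers only from
 * this reserve and fall back to their slow path when it is empty, instead
 * of depleting a cache shared with every other allocator user. */

#define RCU_POOL_SIZE 4          /* hypothetical reserve size */

static void *rcu_pool[RCU_POOL_SIZE];
static int rcu_pool_count;       /* buffers currently in the reserve */

/* Refill the reserve; called from a context where blocking is allowed. */
static void rcu_pool_fill(void)
{
	while (rcu_pool_count < RCU_POOL_SIZE)
		rcu_pool[rcu_pool_count++] = malloc(64);
}

/* Take one buffer.  Returns NULL when the reserve is exhausted, forcing
 * the caller onto its slow path rather than draining a shared cache. */
static void *rcu_pool_get(void)
{
	return rcu_pool_count ? rcu_pool[--rcu_pool_count] : NULL;
}
```

The fixed-size reserve is exactly the tradeoff under discussion: it bounds how much memory is set aside up front, but any flood larger than the reserve pushes callers onto the slow path, which is why sizing it to defend against flooding was estimated above at well over one hundred pages per CPU.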

> And reducing the number of post-grace-period cache misses makes it
> easier for the callback-invocation-time memory freeing to keep up with
> the flooder, thus avoiding (or at least delaying) the OOM.
>
> > > My current belief is that we need a combination of (1) either the
> > > GFP_NOLOCK flag or Peter Zijlstra's patch and
> >
> > I must have missed the patch?
>
> If I am keeping track, this one:
>
> https://lore.kernel.org/lkml/20200814215206.GL3982@worktop.programming.kicks-ass.net/

OK, I had certainly noticed that one, but I did not react to it; my
response would be similar to the one for the dedicated gfp flag. It is
less of a hack than __GFP_NO_LOCKS, but it still exposes very internal
parts of the allocator, and I find that quite problematic from the point
of view of the allocator's future maintenance. The risk of easily
depleting the pcp pool is there as well, of course.
-- 
Michal Hocko
SUSE Labs