From: Hillf Danton
To: NeilBrown
Cc: Trond Myklebust, Anna Schumaker, Andrew Morton, linux-mm@kvack.org, linux-nfs@vger.kernel.org, LKML
Subject: Re: [PATCH 1/2] MM: replace PF_LESS_THROTTLE with PF_LOCAL_THROTTLE
Date: Mon, 6 Apr 2020 11:58:56 +0800
Message-Id: <20200406035856.13768-1-hdanton@sina.com>
In-Reply-To: <20200402042644.17028-1-hdanton@sina.com>
References: <87tv2b7q72.fsf@notabene.neil.brown.name> <87v9miydai.fsf@notabene.neil.brown.name> <20200402042644.17028-1-hdanton@sina.com>

On Thu, 02 Apr 2020 15:57:56 +1100 NeilBrown wrote:
>
> On Thu, Apr 02 2020, Hillf Danton wrote:
>
> > On Thu, 02 Apr 2020 10:53:20 +1100 NeilBrown wrote:
> >>
> >> PF_LESS_THROTTLE exists for loop-back nfsd, and a similar need in the
> >> loop block driver, where a daemon needs to write to one bdi in
> >> order to free up writes queued to another bdi.
> >>
> >> The daemon sets PF_LESS_THROTTLE and gets a larger allowance of dirty
> >> pages, so that it can still dirty pages after other processes have been
> >> throttled.
> >>
> >> This approach was designed when all threads were blocked equally,
> >> independently of which device they were writing to, or how fast it was.
> >> Since that time the writeback algorithm has changed substantially with
> >> different threads getting different allowances based on non-trivial
> >> heuristics.  This means the simple "add 25%" heuristic is no longer
> >> reliable.
> >>
> >> This patch changes the heuristic to ignore the global limits and
> >> consider only the limit relevant to the bdi being written to.  This
> >> approach is already available for BDI_CAP_STRICTLIMIT users (fuse) and
> >> should not introduce surprises.  This has the desired result of
> >> protecting the task from the consequences of large amounts of dirty data
> >> queued for other devices.
> >>
> >> This approach of "only consider the target bdi" is consistent with the
> >> other use of PF_LESS_THROTTLE in current_may_throttle(), where it causes
> >> attention to be focussed only on the target bdi.
> >>
> >> So this patch
> >>  - renames PF_LESS_THROTTLE to PF_LOCAL_THROTTLE,
> >>  - removes the 25% bonus that that flag gives, and
> >>  - imposes 'strictlimit' handling for any process with PF_LOCAL_THROTTLE
> >>    set.
> >
> > /*
> >  * The strictlimit feature is a tool preventing mistrusted filesystems
> >  * from growing a large number of dirty pages before throttling. For
> >
> > Based on the comment snippet, I suspect it is applicable to IO flushers
> > unless they are likely generating tons of dirty pages. If they are,
> > however, cutting their bonuses seems questionable.
>
> The purpose of the strictlimit feature was to isolate one filesystem
> (bdi) from all others, so that the one cannot create dirty pages which
> unfairly disadvantage the others - this is what that comment says.
> But the implementation appears to focus on the isolation, not the
> specific purpose, and isolation works both ways.  It protects the others
> from the one, and the one from the others.
>
> fuse needs to be isolated so it doesn't harm others.
> nfsd and loop need to be isolated so they aren't harmed by others.

For those working in emergency services, extra N95 face masks and Covid-19
testing kits, say 25%, would be preserved, too, if isolation doesn't help
them.

> I'm less familiar with IO flushers but I suspect that they have exactly the
> same need as nfsd and loop - they need to be isolated from dirty pages
> other than on the device they are writing to.
> The 25% bonus was never about giving them a bonus because they need it.
> It was about protecting them from excess usage elsewhere.  For example,
> I strongly suspect that my change will provide a conceptually better
> service for IO flushers.  (whether it is better in a practical measurable
> sense I cannot say, but I'd be surprised if it was worse).
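
To make the two policies under discussion concrete, here is a minimal sketch in C. It is a simplified model and not the kernel's code: struct bdi_state and both function names are hypothetical, the thresholds are plain page counts, and the real logic in mm/page-writeback.c involves more terms than a flat 25%. It only contrasts "global limit plus a 25% bonus" with "look at the target bdi alone".

```c
#include <stdbool.h>

/* Hypothetical per-device state, for illustration only. */
struct bdi_state {
	unsigned long dirty;   /* pages currently dirty against this bdi */
	unsigned long thresh;  /* this bdi's own dirty threshold */
};

/*
 * Old idea (simplified): compare global dirty pages with the global
 * threshold, and give a flagged task 25% more headroom.
 */
bool throttled_with_bonus(unsigned long global_dirty,
			  unsigned long global_thresh,
			  bool less_throttle)
{
	unsigned long limit = global_thresh;

	if (less_throttle)
		limit += limit / 4;	/* the "add 25%" heuristic */

	return global_dirty > limit;
}

/*
 * Proposed idea (simplified): ignore global state and throttle only
 * on the bdi the task is actually writing to.
 */
bool throttled_local_only(const struct bdi_state *target)
{
	return target->dirty > target->thresh;
}
```

With a global threshold of 1000 pages, a flagged task in this model is still throttled once 1300 pages are dirty system-wide, even if its own target device has almost nothing queued; the local-only check does not throttle it in that case. That is the "isolation works both ways" point made in the quoted text.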