Date: Mon, 29 Jun 2020 10:22:09 +0200
From: Michal Hocko
To: Dave Chinner
Cc: Mikulas Patocka, "Matthew Wilcox (Oracle)", linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-xfs@vger.kernel.org, dm-devel@redhat.com, Jens Axboe, NeilBrown
Subject: Re: [PATCH 0/6] Overhaul memalloc_no*
Message-ID: <20200629082209.GC32461@dhcp22.suse.cz>
References: <20200625113122.7540-1-willy@infradead.org> <20200626230847.GI2005@dread.disaster.area>
In-Reply-To: <20200626230847.GI2005@dread.disaster.area>

On Sat 27-06-20 09:08:47, Dave Chinner wrote:
> On Fri, Jun 26, 2020 at 11:02:19AM -0400, Mikulas Patocka wrote:
> > Hi
> >
> > I suggest to join memalloc_noio and memalloc_nofs into just one flag that
> > prevents both filesystem recursion and i/o recursion.
> >
> > Note that any I/O can recurse into a filesystem via the loop device, thus
> > it doesn't make much sense to have a context where PF_MEMALLOC_NOFS is set
> > and PF_MEMALLOC_NOIO is not set.
>
> Correct me if I'm wrong, but I think that will prevent swapping from
> GFP_NOFS memory reclaim contexts. IOWs, this will substantially
> change the behaviour of the memory reclaim system under sustained
> GFP_NOFS memory pressure. Sustained GFP_NOFS memory pressure is
> quite common, so I really don't think we want to be telling memory
> reclaim "you can't do IO at all" when all we are trying to do is
> prevent recursion back into the same filesystem.
>
> Given that the loop device IO path already operates under
> memalloc_noio context, (i.e. the recursion restriction is applied in
> only the context that needs it) I see no reason for making that a
> global reclaim limitation....
>
> In reality, we need to be moving the other way with GFP_NOFS - to
> fine grained anti-recursion contexts, not more broad contexts.

Absolutely agreed! It is not really hard to see a system struggling due
to a heavy FS metadata workload while there are objects which could be
reclaimed.

> That is, GFP_NOFS prevents recursion into any filesystem, not just
> the one that we are actively operating on and needing to prevent
> recursion back into. We can safely have reclaim do reclaim work on
> other filesystems without fear of recursion deadlocks, but the
> memory reclaim infrastructure does not provide that capability.(*)
>
> e.g. if memalloc_nofs_save() took a reclaim context structure into
> which the filesystem put the superblock, the superblock's nesting depth
> (because layering on loop devices can create cross-filesystem
> recursion dependencies), and any other filesystem private data the
> fs wanted to add, we could actually have reclaim only avoid reclaim
> from filesystems where there is a deadlock possibility. e.g:
>
>	- superblock nesting depth is different, apply GFP_NOFS
>	  reclaim unconditionally
>	- superblock different, apply GFP_KERNEL reclaim
>	- superblock the same, pass context to filesystem to
>	  decide if reclaim from the superblock is safe.
>
> At this point, memory reclaim is always able to
> reclaim from filesystems that are not at risk of recursion
> deadlocks. Direct reclaim is much more likely to be able to make
> progress now because it is much less restricted in what it can
> reclaim. That's going to make direct reclaim faster and more
> efficient, and that's the ultimate goal we are aiming to achieve
> here...

Yes, we have discussed something like that a few years back at LSFMM IIRC.
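To make the three cases quoted above concrete, here is a rough user-space
sketch of the decision only. Nothing like this exists in the tree: the
names reclaim_ctx, reclaim_mode and reclaim_mode_for() are hypothetical,
and the opaque pointer merely stands in for a struct super_block pointer.

```c
/*
 * Hypothetical model of the per-superblock reclaim decision described
 * in the quoted proposal. All names are illustrative assumptions and
 * not an existing kernel API.
 */
struct reclaim_ctx {
	const void *sb;		/* stands in for struct super_block * */
	int nesting_depth;	/* loop-device layering depth of that sb */
};

enum reclaim_mode {
	RECLAIM_NOFS,		/* behave like GFP_NOFS reclaim */
	RECLAIM_KERNEL,		/* behave like GFP_KERNEL reclaim */
	RECLAIM_ASK_FS,		/* let the filesystem decide */
};

/* Pick a reclaim mode for a target superblock given the caller's scope. */
static enum reclaim_mode reclaim_mode_for(const struct reclaim_ctx *ctx,
					  const void *target_sb,
					  int target_depth)
{
	/* Different nesting depth: recursion across layers is possible. */
	if (ctx->nesting_depth != target_depth)
		return RECLAIM_NOFS;

	/* Same depth, different superblock: no recursion back into us. */
	if (ctx->sb != target_sb)
		return RECLAIM_KERNEL;

	/* Same superblock: only the filesystem knows whether it is safe. */
	return RECLAIM_ASK_FS;
}
```

The checks are in the same order as the list: nesting depth first,
because a depth mismatch forces NOFS behaviour no matter which
superblock is being looked at.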
The scoped NOFS/NOIO API was just a first step to reduce explicit
NOFS/NOIO usage, with the hope that we will get no-recursion entry points
much better defined and get rid of many instances of "this is fs code
so it has to use the NOFS gfp mask". Some of that has happened and that is
really great. On the other hand, many people still like to use that API
as a workaround for an immediate problem because no-recursion scopes are
much harder to recognize unless you are super familiar with the specific
fs/IO layer implementation. So this is definitely not a project for
somebody to go over all the code and just do the clean up.

Thanks!
-- 
Michal Hocko
SUSE Labs