Date: Thu, 23 Sep 2021 14:00:46 -0400
From: Johannes Weiner
To: Kent Overstreet
Cc: Linus Torvalds, Matthew Wilcox, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    Andrew Morton, "Darrick J. Wong", Christoph Hellwig, David Howells
Subject: Re: Folios for 5.15 request - Was: re: Folio discussion recap -

On Thu, Sep 23, 2021 at 01:42:17AM -0400, Kent Overstreet wrote:
> On Wed, Sep 22, 2021 at 11:08:58AM -0400, Johannes Weiner wrote:
> > On Tue, Sep 21, 2021 at 05:22:54PM -0400, Kent Overstreet wrote:
> > >  - it's become apparent that there haven't been any real objections to the code
> > >    that was queued up for 5.15. There _are_ very real discussions and points of
> > >    contention still to be decided and resolved for the work beyond file backed
> > >    pages, but those discussions were what derailed the more modest, and more
> > >    badly needed, work that affects everyone in filesystem land
> >
> > Unfortunately, I think this is a result of me wanting to discuss a way
> > forward rather than a way back.
> >
> > To clarify: I do very much object to the code as currently queued up,
> > and not just to a vague future direction.
> >
> > The patches add and convert a lot of complicated code to provision for
> > a future we do not agree on. The indirections it adds, and the hybrid
> > state it leaves the tree in, make it directly more difficult to work
> > with and understand the MM code base. Stuff that isn't needed for
> > exposing folios to the filesystems.
>
> I think something we need is an alternate view - anon_folio, perhaps - and an
> idea of what that would look like. Because you've been saying you don't think
> file pages and anonymous pages are similar enough to be the same type - so if
> they're not, how's the code that works on both types of pages going to change to
> accommodate that?
>
> Do we have if (file_folio) else if (anon_folio) both doing the same thing, but
> operating on different types? Some sort of subclassing going on?

Yeah, with subclassing and a generic type for shared code. I outlined
that earlier in the thread:

https://lore.kernel.org/all/YUo20TzAlqz8Tceg@cmpxchg.org/

So you have anon_page and file_page being subclasses of page - similar
to how filesystems have subclasses that inherit from struct inode - to
help refactor what is generic, what isn't, and highlight what should
be.

Whether we do anon_page and file_page inheriting from struct page, or
anon_folio and file_folio inheriting from struct folio - either would
work of course. Again I think it comes down to the value proposition
of folio as a means to clean up compound pages inside the MM code.
It's pretty uncontroversial that we want PAGE_SIZE assumptions gone
from the filesystems, networking, drivers and other random code. The
argument for MM code is a different one.
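
As a rough userspace illustration of the subclassing idea above (a
generic base type for shared code, with anon and file subtypes embedding
it the way filesystem inodes embed struct inode), something like the
following could work. This is only a sketch and is not taken from the
patches or from the linked mail: every name here (page_sketch,
file_page_sketch, anon_page_sketch, PG_SKETCH_ANON, the fields and the
helpers) is hypothetical, and the real struct page layout is
considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical flag marking the anon subtype. */
#define PG_SKETCH_ANON 0x1UL

/* Hypothetical generic base: only state shared by every page type. */
struct page_sketch {
	unsigned long flags;
	int refcount;
};

/* Hypothetical subtypes embed the base, like fs-specific inodes do. */
struct file_page_sketch {
	struct page_sketch page;
	unsigned long index;		/* offset within the file mapping */
};

struct anon_page_sketch {
	struct page_sketch page;
	unsigned long swap_entry;	/* anon-only state */
};

/* Generic code operates on the base type only. */
static inline void page_sketch_get(struct page_sketch *page)
{
	page->refcount++;
}

static inline bool page_sketch_is_anon(const struct page_sketch *page)
{
	return (page->flags & PG_SKETCH_ANON) != 0;
}

/* Checked downcasts: type-specific code asks for its subtype explicitly. */
static inline struct file_page_sketch *to_file_page(struct page_sketch *page)
{
	assert(!page_sketch_is_anon(page));
	return container_of(page, struct file_page_sketch, page);
}

static inline struct anon_page_sketch *to_anon_page(struct page_sketch *page)
{
	assert(page_sketch_is_anon(page));
	return container_of(page, struct anon_page_sketch, page);
}
```

In this shape, shared paths take a struct page_sketch pointer, while
file-only or anon-only paths take the subtype, so the function signature
documents which kind of page a function expects.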

We seem to be discussing the folio abstraction as a binary thing for
the Linux kernel, rather than a selectively applied tool, and I think
it prevents us from doing proper one-by-one cost/benefit analyses on
the areas of application.

I suggested the anon/file split as an RFC to sidestep the cost/benefit
question of doing the massive folio change in MM just to clean up the
compound pages; taking the idea of redoing the page typing, just in a
way that would maybe benefit MM code more broadly and obviously.

> I was agreeing with you that slab/network pools etc. shouldn't be folios - that
> folios shouldn't be a replacement for compound pages. But I think we're going to
> need a serious alternative proposal for anonymous pages if you're still against
> them becoming folios, especially because according to Kirill they're already
> working on that (and you have to admit transhuge pages did introduce a mess that
> they will help with...)

I think we need a better analysis of that mess and a concept of where
tailpages are and should be, if that is the justification for the MM
conversion.

The motivation is that we have a ton of compound_head() calls in
places we don't need them. No argument there, I think.

But the explanation for going with whitelisting - the most invasive
approach possible (and which leaves more than one person "unenthused"
about that part of the patches) - is that it's difficult and error
prone to identify which ones are necessary and which ones are not. And
maybe that we'll continue to have a widespread hybrid existence of
head and tail pages that will continue to require clarification.

But that seems to be an article of faith. It's implied by the
approach, but this may or may not be the case.

I certainly think it used to be messier in the past. But strides have
been made already to narrow the channels through which tail pages can
actually enter the code.
Certainly we can rule out entire MM subsystems and
simply declare their compound_head() usage unnecessary with little
risk or ambiguity.

Then the question becomes which ones are legit. Whether anybody
outside the page allocator ever needs to *see* a tailpage struct page
to begin with. (Arguably that bit in __split_huge_page_tail() could be
a page allocator function; the pte handling is pfn-based except for
the mapcount management which could be encapsulated; the collapse code
uses vm_normal_page() but follows it quickly by compound_head() - and
arguably a tailpage generally isn't a "normal" vm page, so a new
pfn_to_normal_page() could encapsulate the compound_head()).

Because if not, seeing struct page in MM code isn't nearly as
ambiguous as is being implied. You would never have to worry about
it - unless you are in fact the page allocator.

So if this problem could be solved by making tail pages an
encapsulated page_alloc thing, and chasing down the rest of
find_subpage() callers (which needs to happen anyway), I don't think a
wholesale folio conversion of this subsystem would be justified.

A more in-depth analysis of where and how we need to deal with
tailpages - laying out the data structures that hold them and code
entry points for them - would go a long way for making the case for
folios. And might convince reluctant people to get behind the effort.
Or show that we don't need it. Either way, it seems like a win-win.

But I do think the onus for explaining why the particular approach was
chosen against much less invasive options is on the person pushing the
changes. And it should be more detailed than "we all know it sucks".
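
To make the encapsulation argument above concrete, here is a toy
userspace model of a pfn-based lookup that resolves to the head page
internally, so callers never receive a tail page. It is a sketch under
stated assumptions, not kernel code: toy_page, toy_mem_map,
toy_prep_compound(), toy_compound_head() and toy_pfn_to_normal_page()
are all hypothetical names, and the head pointer is a simplification of
how compound pages are actually encoded.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_PAGES 8

/* Hypothetical page descriptor: tail pages point at their head. */
struct toy_page {
	struct toy_page *head;	/* set on tail pages only, NULL otherwise */
};

/* Toy memory map indexed by pfn; zero-initialized, so all order-0. */
static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Allocator-private: form a compound page of 2^order pages at pfn. */
static inline void toy_prep_compound(unsigned long pfn, unsigned int order)
{
	unsigned long i;

	assert(pfn + (1UL << order) <= TOY_NR_PAGES);
	toy_mem_map[pfn].head = NULL;
	for (i = 1; i < (1UL << order); i++)
		toy_mem_map[pfn + i].head = &toy_mem_map[pfn];
}

/* Allocator-private: the one place that knows about tail pages. */
static inline struct toy_page *toy_compound_head(struct toy_page *page)
{
	return page->head ? page->head : page;
}

/* The lookup exposed to everyone else: never returns a tail page. */
static inline struct toy_page *toy_pfn_to_normal_page(unsigned long pfn)
{
	assert(pfn < TOY_NR_PAGES);
	return toy_compound_head(&toy_mem_map[pfn]);
}
```

With a lookup like this, code outside the allocator only ever holds head
or order-0 pages, which is the property that would make most
compound_head() calls in that code unnecessary.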