Date: Fri, 27 Aug 2021 13:05:39 +0100
From: Matthew Wilcox
To: Johannes Weiner
Cc: David Howells, Linus Torvalds, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Andrew Morton
Subject: Re: [GIT PULL] Memory folios for v5.15

On Fri, Aug 27, 2021 at 06:03:25AM -0400, Johannes Weiner wrote:
> At the current stage of conversion, folio is a more clearly delineated
> API of what can be safely used from the FS for the interaction with
> the page cache and memory management. And it looks still flexible to
> make all sorts of changes, including how it's backed by
> memory. Compared with the page, where parts of the API are for the FS,
> but there are tons of members, functions, constants, and restrictions
> due to the page's role inside MM core code. Things you shouldn't be
> using, things you shouldn't be assuming from the fs side, but it's
> hard to tell which is which, because struct page is a lot of things.
>
> However, the MM narrative for folios is that they're an abstraction
> for regular vs compound pages. This is rather generic. Conceptually,
> it applies very broadly and deeply to MM core code: anonymous memory
> handling, reclaim, swapping, even the slab allocator uses them. If we
> follow through on this concept from the MM side - and that seems to be
> the plan - it's inevitable that the folio API will grow more
> MM-internal members, methods, as well as restrictions again in the
> process. Except for the tail page bits, I don't see too much in struct
> page that would not conceptually fit into this version of the folio.
So the superhypermegaultra ambitious version of this does something like:

struct slab_page {
	unsigned long flags;
	union {
		struct list_head slab_list;
		struct {
			...
		};
	};
	struct kmem_cache *slab_cache;
	void *freelist;
	void *s_mem;
	unsigned int active;
	atomic_t _refcount;
	unsigned long memcg_data;
};

struct folio {
	... more or less as now ...
};

struct net_page {
	unsigned long flags;
	unsigned long pp_magic;
	struct page_pool *pp;
	unsigned long _pp_mapping_pad;
	unsigned long dma_addr[2];
	atomic_t _mapcount;
	atomic_t _refcount;
	unsigned long memcg_data;
};

struct page {
	union {
		struct folio folio;
		struct slab_page slab;
		struct net_page pool;
		...
	};
};

and then functions which only take one specific type of page use that
type.  And the compiler will tell you that you can't pass a net_page
to a slab function, or vice versa.

This is a lot more churn, and I'm far from convinced that it's worth
doing.  There's also the tricky "This page is mappable to userspace"
kind of functions, which (for example) includes vmalloc and net_page
as well as folios and random driver allocations, but shouldn't include
slab or page table pages.  They're especially tricky because mapping
to userspace comes with rules around the use of the ->mapping field as
well as ->_mapcount.