linux-kernel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	David Howells <dhowells@redhat.com>,
	Hugh Dickins <hughd@google.com>
Subject: Re: Folios for 5.15 request - Was: re: Folio discussion recap -
Date: Fri, 22 Oct 2021 02:52:31 +0100
Message-ID: <YXIZX0truEBv2YSz@casper.infradead.org>
In-Reply-To: <YXHdpQTL1Udz48fc@cmpxchg.org>

On Thu, Oct 21, 2021 at 05:37:41PM -0400, Johannes Weiner wrote:
> Here is my summary of the discussion, and my conclusion:

Thank you for this.  It's the clearest, most useful post on this thread,
including my own.  It really highlights the substantial points that
should be discussed.

> The premise of the folio was initially to simply be a type that says:
> I'm the headpage for one or more pages. Never a tailpage. Cool.
> 
> However, after we talked about what that actually means, we seem to
> have some consensus on the following:
> 
> 	1) If folio is to be a generic headpage, it'll be the new
> 	   dumping ground for slab, network, drivers etc. Nobody is
> 	   psyched about this, hence the idea to split the page into
> 	   subtypes which already resulted in the struct slab patches.
> 
> 	2) If higher-order allocations are going to be the norm, it's
> 	   wasteful to statically allocate full descriptors at a 4k
> 	   granularity. Hence the push to eliminate overloading and do
> 	   on-demand allocation of necessary descriptor space.
> 
> I think that's accurate, but for the record: is there anybody who
> disagrees with this and insists that struct folio should continue to
> be the dumping ground for all kinds of memory types?

I think there's a useful distinction to be drawn between "where we're
going with this patchset", "where we're going in the next six to twelve
months" and "where we're going eventually".  I think we have minor
differences of opinion on the answers to those questions, and they can
be resolved as we go, instead of up-front.

My answer to that question is that, while this full conversion is not
part of this patch, struct folio is logically:

struct folio {
	... almost everything that's currently in struct page ...
};

struct page {
    unsigned long flags;
    unsigned long compound_head;
    union {
        struct { /* First tail page only */
            unsigned char compound_dtor;
            unsigned char compound_order;
            atomic_t compound_mapcount;
            unsigned int compound_nr;
        };
        struct { /* Second tail page only */
            atomic_t hpage_pinned_refcount;
            struct list_head deferred_list;
        };
        unsigned long padding1[4];
    };
    unsigned int padding2[2];
#ifdef CONFIG_MEMCG
    unsigned long padding3;
#endif
#ifdef WANT_PAGE_VIRTUAL
    void *virtual;
#endif
#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
    int _last_cpupid;
#endif
};

(I'm open to being told I have some of that wrong, eg maybe _last_cpupid
is actually part of struct folio and isn't a per-page property at all)

I'd like to get there in the next year.  I think dynamically allocating
memory descriptors is more than a year out.

Now, as far as struct folio being a dumping ground, I would like to
split other things out from struct folio.  Let me address that below.

> Let's assume the answer is "no" for now and move on.
> 
> If folios are NOT the common headpage type, it begs two questions:
> 
> 	1) What subtype(s) of page SHOULD it represent?
> 
> 	   This is somewhat unclear at this time. Some say file+anon.
> 	   It's also been suggested everything userspace-mappable, but
> 	   that would again bring back major type punning. Who knows?
> 
> 	   Vocal proponents of the folio type have made conflicting
> 	   statements on this, which certainly gives me pause.
> 
> 	2) What IS the common type used for attributes and code shared
> 	   between subtypes?
> 
> 	   For example: if a folio is anon+file, then the code that
>            maps memory to userspace needs a generic type in order to
>            map both folios and network pages. Same as the page table
>            walkers, and things like GUP.
> 
> 	   Will this common type be struct page? Something new? Are we
> 	   going to duplicate the implementation for each subtype?
> 
> 	   Another example: GUP can return tailpages. I don't see how
> 	   it could return folio with even its most generic definition
> 	   of "headpage".
> 
> (But bottomline, it's not clear how folio can be the universal
> headpage type and simultaneously avoid being the type dumping ground
> that the page was. Maybe I'm not creative enough?)

This whole section is predicated on "If it is NOT the headpage type",
but I think this is a great list of why it _should_ be the generic
headpage type.

To answer one of the questions in here: GUP should continue to return precise
pages because that's what its callers expect.  But we should have a
better interface than GUP which returns a rather more compressed list
(something like today's biovec).
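
As a rough illustration of what a "more compressed list" could look like,
here is a minimal user-space sketch in C.  It assumes that "compressed"
means coalescing physically consecutive pages into (start, length) runs,
loosely in the spirit of a biovec.  The names toy_range and
toy_compress_pfns are hypothetical; they are not kernel APIs, and the
message above does not specify such an interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: one run of physically consecutive pages. */
struct toy_range {
	unsigned long first_pfn;	/* page frame number of the first page */
	unsigned long nr_pages;		/* number of consecutive pages in the run */
};

/*
 * Coalesce pfns[0..n) into runs of consecutive PFNs, preserving order.
 * 'out' must have room for n entries (the worst case: nothing merges).
 * Returns the number of runs written to 'out'.
 */
static size_t toy_compress_pfns(const unsigned long *pfns, size_t n,
				struct toy_range *out)
{
	size_t nr = 0;

	assert(n == 0 || (pfns && out));
	for (size_t i = 0; i < n; i++) {
		struct toy_range *last = nr ? &out[nr - 1] : NULL;

		if (last && last->first_pfn + last->nr_pages == pfns[i]) {
			/* Extends the previous run by one page. */
			last->nr_pages++;
		} else {
			/* Starts a new run. */
			out[nr].first_pfn = pfns[i];
			out[nr].nr_pages = 1;
			nr++;
		}
	}
	return nr;
}
```

A caller that pins a large, physically contiguous buffer would then get
one entry back instead of one entry per 4k page, which is the kind of
saving the precise per-page list cannot offer.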

> Anyway. I can even be convinced that we can figure out the exact fault
> lines along which we split the page down the road.
> 
> My worry is more about 2). A shared type and generic code is likely to
> emerge regardless of how we split it. Think about it, the only world
> in which that isn't true would be one in which either
> 
> 	a) page subtypes are all the same, or
> 	b) the subtypes have nothing in common
> 
> and both are clearly bogus.

Amen!

I'm convinced that pgtable, slab and zsmalloc uses of struct page can all
be split out into their own types instead of being folios.  They have
little-to-nothing in common with anon+file; they can't be mapped into
userspace and they can't be on the LRU.  The only situation you can find
them in is something like compaction which walks PFNs.

I don't think we can split out ZONE_DEVICE and netpool into their own
types.  While they can't be on the LRU, they can be mapped to userspace,
like random device drivers.  So they can be found by GUP, and we want
(need) to be able to go to folio from there in order to get, lock and
set a folio as dirty.  Also, they have a mapcount as well as a refcount.

The real question, I think, is whether it's worth splitting anon & file
pages out from generic pages.  I can see arguments for it, but I can also
see arguments against it (whether it's two types: lru_mem and folio,
three types: anon_mem, file_mem and folio or even four types: ksm_mem,
anon_mem, file_mem and folio).  I don't think a compelling argument has
been made either way.

Perhaps you could comment on how you'd see separate anon_mem and
file_mem types working for the memcg code?  Would you want to have
separate lock_anon_memcg() and lock_file_memcg(), or would you want
them to be cast to a common type like lock_folio_memcg()?
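
To make the two options concrete, here is a small user-space sketch in C.
Everything in it is a toy model: the toy_ prefixed types and the
lock_depth counter are assumptions made for illustration, and only the
function names echo the ones mentioned above.  It shows per-type entry
points implemented as thin wrappers around a single entry point on the
common type.

```c
#include <assert.h>

/* Toy stand-in for a memcg; the "lock" is just a counter. */
struct toy_memcg {
	int lock_depth;
};

/* Toy common type shared by anon and file memory. */
struct toy_folio {
	struct toy_memcg *memcg;
};

/* Toy subtypes that embed the common type as their first member. */
struct toy_anon_mem {
	struct toy_folio folio;
};

struct toy_file_mem {
	struct toy_folio folio;
};

/* One entry point that operates on the common type. */
static void toy_lock_folio_memcg(struct toy_folio *folio)
{
	assert(folio && folio->memcg);
	folio->memcg->lock_depth++;
}

/* Separate, type-checked entry points that forward to the common one. */
static void toy_lock_anon_memcg(struct toy_anon_mem *anon)
{
	toy_lock_folio_memcg(&anon->folio);
}

static void toy_lock_file_memcg(struct toy_file_mem *file)
{
	toy_lock_folio_memcg(&file->folio);
}
```

With the wrappers, callers keep compile-time type checking at the cost of
one extra function per subtype; without them, every caller converts to
the common type itself.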

P.S. One variant we haven't explored is separating type specialisation
from finding the head page.  eg, instead of having

struct slab *slab = page_slab(page);

we could have:

struct slab *slab = folio_slab(page_folio(page));

I don't think it's particularly worth doing, but Kent mused about it
at one point.
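
For readers who want to see the difference, here is a toy user-space
model in C.  It is not the kernel's implementation: the toy_ prefixed
names, the TOY_PG_SLAB flag and the two-field struct page are
assumptions, and the only detail borrowed from the kernel is that a tail
page stores the head page's address with bit 0 set in compound_head.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; these are not the kernel's definitions. */
struct page {
	uintptr_t flags;
	uintptr_t compound_head;	/* tail pages: head address | 1 */
};

struct folio {
	struct page page;
};

struct slab {
	struct page page;
};

#define TOY_PG_SLAB	((uintptr_t)1)	/* hypothetical "this is slab" flag */

/* Step 1: find the head page. */
static struct folio *toy_page_folio(struct page *page)
{
	uintptr_t head = page->compound_head;

	if (head & 1)
		return (struct folio *)(head - 1);
	return (struct folio *)page;
}

/* Step 2: specialise the type; takes something known to be a head. */
static struct slab *toy_folio_slab(struct folio *folio)
{
	assert(folio->page.flags & TOY_PG_SLAB);
	return (struct slab *)folio;
}

/* Combined variant: head lookup and specialisation in one call. */
static struct slab *toy_page_slab(struct page *page)
{
	return toy_folio_slab(toy_page_folio(page));
}
```

In the split form, the helper that specialises the type never has to
consider tail pages, because its argument type already rules them out.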
