linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: Folios for 5.15 request - Was: re: Folio discussion recap -
Date: Mon, 18 Oct 2021 12:47:37 -0400	[thread overview]
Message-ID: <YW2lKcqwBZGDCz6T@cmpxchg.org> (raw)
In-Reply-To: <YWpG1xlPbm7Jpf2b@casper.infradead.org>

On Sat, Oct 16, 2021 at 04:28:23AM +0100, Matthew Wilcox wrote:
> On Wed, Sep 22, 2021 at 11:08:58AM -0400, Johannes Weiner wrote:
> >       mm/memcg: Add folio_memcg() and related functions
> >       mm/memcg: Convert commit_charge() to take a folio
> >       mm/memcg: Convert mem_cgroup_charge() to take a folio
> >       mm/memcg: Convert uncharge_page() to uncharge_folio()
> >       mm/memcg: Convert mem_cgroup_uncharge() to take a folio
> >       mm/memcg: Convert mem_cgroup_migrate() to take folios
> >       mm/memcg: Convert mem_cgroup_track_foreign_dirty_slowpath() to folio
> >       mm/memcg: Add folio_memcg_lock() and folio_memcg_unlock()
> >       mm/memcg: Convert mem_cgroup_move_account() to use a folio
> >       mm/memcg: Add folio_lruvec()
> >       mm/memcg: Add folio_lruvec_lock() and similar functions
> >       mm/memcg: Add folio_lruvec_relock_irq() and folio_lruvec_relock_irqsave()
> >       mm/workingset: Convert workingset_activation to take a folio	
> > 
> > 		This is all anon+file stuff, not needed for filesystem
> > 		folios.
> 
> No, that's not true.  A number of these functions are called from
> filesystem code. mem_cgroup_track_foreign_dirty() is only called
> from filesystem code. We at the very least need wrappers like
> folio_cgroup_charge(), and folio_memcg_lock().

Well, a handful of exceptions don't refute the broader point.

No objection from me to convert mem_cgroup_track_foreign_dirty().

No objection to add a mem_cgroup_charge_folio(). But I insist on the
subsystem prefix, because that's in line with how we're charging a
whole bunch of other different things (swap, skmem, etc.). It'll also
match a mem_cgroup_charge_anon() if we agree to an anon type.

folio_memcg_lock() sounds good to me.

> > 		As per the other email, no conceptual entry point for
> > 		tail pages into either subsystem, so no ambiguity
> > 		around the necessity of any compound_head() calls,
> > 		directly or indirectly. It's easy to rule out
> > 		wholesale, so there is no justification for
> > 		incrementally annotating every single use of the page.
> 
> The justification is that we can remove all those hidden calls to
> compound_head().  Hundreds of bytes of text spread throughout this file.

I find this line of argument highly disingenuous.

No new type is necessary to remove these calls inside MM code. Migrate
them into the callsites and remove the 99.9% very obviously bogus
ones. The process is the same whether you switch to a new type or not.

(I'll send more patches like the PageSlab() ones to that effect. It's
easy. The only reason nobody has bothered removing those until now is
that nobody reported regressions when they were added.)

But typesafety is an entirely different argument. And to reiterate the
main point of contention on these patches: there is no concensus among
MM people how (or whether) we want MM-internal typesafety for pages.

Personally, I think we do, but I don't think head vs tail is the most
important or the most error-prone aspect of the many identities struct
page can have. In most cases it's not even in the top 5 of questions I
have about the page when I see it in a random MM context (outside of
the very few places that do virt_to_page or pfn_to_page). Therefor
"folio" is not a very poignant way to name the object that is passed
around in most MM code. struct anon_page and struct file_page would be
way more descriptive and would imply the head/tail aspect.

Anyway, the email you are responding to was an offer to split the
uncontroversial "large pages backing filesystems" part from the
controversial "MM-internal typesafety" discussion. Several people in
both the fs space and the mm space have now asked to do this to move
ahead. Since you have stated in another subthread that you "want to
get back to working on large pages in the page cache," and you never
wanted to get involved that deeply in the struct page subtyping
efforts, it's not clear to me why you are not taking this offer.

> >       mm: Add folio_young and folio_idle
> >       mm/swap: Add folio_activate()
> >       mm/swap: Add folio_mark_accessed()
> > 
> > 		This is anon+file aging stuff, not needed.
> 
> Again, very much needed.  Take a look at pagecache_get_page().  In Linus'
> tree today, it calls if (page_is_idle(page)) clear_page_idle(page);
> So either we need wrappers (which are needlessly complicated thanks to
> how page_is_idle() is defined) or we just convert it.

I'm not sure I understand the complication. That you'd have to do

	if (page_is_idle(folio->page))
		clear_page_idle(folio->page)

inside code in mm/?

It's either that, or

a) generic code shared with anon pages has to do:

	if (folio_is_idle(page->folio))
		clear_folio_idle(page->folio)

which is a weird, or

b) both types work with their own wrappers:

	if (page_is_idle(page))
		clear_page_idle(page)

	if (folio_is_idle(folio))
		clear_folio_idle(folio)

and it's not obvious at all that they are in fact tracking the same
state.

State which is exported to userspace through the "page_idle" feature.

Doing the folio->page translation in mm/-private code, and keeping
this a page interface, is by far the most preferable solution.

> >       mm/rmap: Add folio_mkclean()
> > 
> >       mm/migrate: Add folio_migrate_mapping()
> >       mm/migrate: Add folio_migrate_flags()
> >       mm/migrate: Add folio_migrate_copy()
> > 
> > 		More anon+file conversion, not needed.
> 
> As far as I can tell, anon never calls any of these three functions.
> anon calls migrate_page(), which calls migrate_page_move_mapping(),
> but several filesystems do call these individual functions.

In the current series, migrate_page_move_mapping() has been replaced,
and anon pages go through them:

int folio_migrate_mapping(struct address_space *mapping,
                struct folio *newfolio, struct folio *folio, int extra_count)
{
	[...]
        if (!mapping) {
                /* Anonymous page without mapping */
                if (folio_ref_count(folio) != expected_count)
                        return -EAGAIN;

                /* No turning back from here */
                newfolio->index = folio->index;
                newfolio->mapping = folio->mapping;
                if (folio_test_swapbacked(folio))
                        __folio_set_swapbacked(newfolio);

That's what I'm objecting to.

I'm not objecting to adding these to the filesystem interface as thin
folio->page wrappers that call the page implementation.

> >       mm/lru: Add folio_add_lru()
> > 
> > 		LRU code, not needed.
> 
> Again, we need folio_add_lru() for filemap.  This one's more
> tractable as a wrapper function.

Please don't quote selectively to the point of it being misleading.

The original block my statement applied to was this:

      mm: Add folio_evictable()
      mm/lru: Convert __pagevec_lru_add_fn to take a folio
      mm/lru: Add folio_add_lru()

which goes way behond just being filesystem-interfacing.

I have no objection to a cache interface function for adding a folio
to the LRU (a wrapper to encapsulate the folio->page transition).

However, like with the memcg code above, the API is called lru_cache:
we have had lru_cache_add_file() and lru_cache_add_anon() in the past,
so lru_cache_add_folio() seems more appropriate - especially as long
as we still have one for pages (and maybe later one for anon pages).

---

All that to say, adding folio as a new type for file headpages with
API functions like this:

	mem_cgroup_charge_folio()
	lru_cache_add_folio()

now THAT would be an incremental change to the kernel code.

And if that new type proves like a great idea, we can do the same for
anon - whether with a shared type or with separate types.

And if it does end up the same type, in the interfaces and in the
implementation, we can merge

	mem_cgroup_charge_page()	# generic bits
	mem_cgroup_charge_folio()	# file bits
	mem_cgroup_charge_anon()	# anon bits

back into a single function, just like we've done it already for the
anon and file variants of those functions that we have had before.

And if we then want to rename that function to something we agree is
more appropriate, we can do that as yet another step.

That would actually be incremental refactoring.

  reply	other threads:[~2021-10-18 16:47 UTC|newest]

Thread overview: 162+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-23 19:01 [GIT PULL] Memory folios for v5.15 Matthew Wilcox
2021-08-23 21:26 ` Johannes Weiner
2021-08-23 22:06   ` Linus Torvalds
2021-08-24  2:20     ` Matthew Wilcox
2021-08-24 13:04     ` Matthew Wilcox
2021-08-23 22:15   ` Matthew Wilcox
2021-08-24 18:32     ` Johannes Weiner
2021-08-24 18:59       ` Linus Torvalds
2021-08-25  6:39         ` Christoph Hellwig
2021-08-24 19:44       ` Matthew Wilcox
2021-08-25 15:13         ` Johannes Weiner
2021-08-26  0:45           ` Darrick J. Wong
2021-08-27 14:07             ` Johannes Weiner
2021-08-27 18:44               ` Matthew Wilcox
2021-08-27 21:41                 ` Dan Williams
2021-08-27 21:49                   ` Matthew Wilcox
2021-08-30 17:32                 ` Johannes Weiner
2021-08-30 18:22                   ` Matthew Wilcox
2021-08-30 20:27                     ` Johannes Weiner
2021-08-30 21:38                       ` Matthew Wilcox
2021-08-31 17:40                         ` Vlastimil Babka
2021-09-01 17:43                         ` Johannes Weiner
2021-09-02 15:13                           ` Zi Yan
2021-09-06 14:00                             ` Vlastimil Babka
2021-08-31 18:50                       ` Eric W. Biederman
2021-08-26  8:58         ` David Howells
2021-08-27 10:03           ` Johannes Weiner
2021-08-27 12:05             ` Matthew Wilcox
2021-08-27 10:49           ` David Howells
2021-08-24 15:54   ` David Howells
2021-08-24 17:56     ` Matthew Wilcox
2021-08-24 18:26       ` Linus Torvalds
2021-08-24 18:29         ` Linus Torvalds
2021-08-24 19:26           ` Theodore Ts'o
2021-08-24 19:34           ` David Howells
2021-08-24 20:02             ` Theodore Ts'o
2021-08-24 21:32             ` David Howells
2021-08-25 12:08               ` Jeff Layton
2021-08-24 19:01         ` Matthew Wilcox
2021-08-24 19:11           ` Linus Torvalds
2021-08-24 19:23             ` Matthew Wilcox
2021-08-24 19:44               ` Theodore Ts'o
2021-08-24 20:00                 ` Matthew Wilcox
2021-08-25  6:32                 ` Christoph Hellwig
2021-08-25  9:01                   ` Rasmus Villemoes
2021-08-26  6:32                     ` Amir Goldstein
2021-08-25 12:03                   ` Jeff Layton
2021-08-26  0:59                     ` Darrick J. Wong
2021-08-26  4:02                   ` Nicholas Piggin
2021-09-01 12:58                 ` Mike Rapoport
2021-08-24 19:35             ` David Howells
2021-08-24 20:35               ` Vlastimil Babka
2021-08-24 20:40                 ` Vlastimil Babka
2021-08-24 19:11         ` David Howells
2021-08-24 19:25           ` Linus Torvalds
2021-08-24 19:38             ` Linus Torvalds
2021-08-24 19:48               ` Linus Torvalds
2021-08-26 17:18                 ` Matthew Wilcox
2021-08-24 19:59             ` David Howells
2021-10-05 13:52   ` Matthew Wilcox
2021-10-05 17:29     ` Johannes Weiner
2021-10-05 17:32       ` David Hildenbrand
2021-10-05 18:30       ` Matthew Wilcox
2021-10-05 19:56         ` Jason Gunthorpe
2021-08-28  3:29 ` Matthew Wilcox
2021-09-09 12:43 ` Christoph Hellwig
2021-09-09 13:56   ` Vlastimil Babka
2021-09-09 18:16     ` Johannes Weiner
2021-09-09 18:44       ` Matthew Wilcox
2021-09-09 22:03         ` Johannes Weiner
2021-09-09 22:48           ` Matthew Wilcox
2021-09-09 19:17     ` John Hubbard
2021-09-09 19:23       ` Matthew Wilcox
2021-09-10 20:16 ` Folio discussion recap Kent Overstreet
2021-09-11  1:23   ` Kirill A. Shutemov
2021-09-13 11:32     ` Michal Hocko
2021-09-13 18:12       ` Johannes Weiner
2021-09-15 15:40   ` Johannes Weiner
2021-09-15 17:55     ` Damian Tometzki
2021-09-16  2:58     ` Darrick J. Wong
2021-09-16 16:54       ` Johannes Weiner
2021-09-17  5:24         ` Dave Chinner
2021-09-17  7:18           ` Christoph Hellwig
2021-09-17 16:31           ` Johannes Weiner
2021-09-17 20:57             ` Kirill A. Shutemov
2021-09-17 21:17               ` Kent Overstreet
2021-09-17 22:02                 ` Kirill A. Shutemov
2021-09-17 22:21                   ` Kent Overstreet
2021-09-17 23:15               ` Johannes Weiner
2021-09-20 10:03                 ` Kirill A. Shutemov
2021-09-17 21:13             ` Kent Overstreet
2021-09-17 22:25               ` Theodore Ts'o
2021-09-17 23:35                 ` Josef Bacik
2021-09-18  1:04             ` Dave Chinner
2021-09-18  4:51               ` Kent Overstreet
2021-09-20  1:04                 ` Dave Chinner
2021-09-16 21:58       ` David Howells
2021-09-20  2:17   ` Matthew Wilcox
2021-09-21 19:47     ` Johannes Weiner
2021-09-21 20:38       ` Matthew Wilcox
2021-09-21 21:11         ` Kent Overstreet
2021-09-21 21:22           ` Folios for 5.15 request - Was: re: Folio discussion recap - Kent Overstreet
2021-09-22 15:08             ` Johannes Weiner
2021-09-22 15:46               ` Kent Overstreet
2021-09-22 16:26                 ` Matthew Wilcox
2021-09-22 16:56                   ` Chris Mason
2021-09-22 19:54                     ` Matthew Wilcox
2021-09-22 20:15                       ` Kent Overstreet
2021-09-22 20:21                       ` Linus Torvalds
2021-09-23  5:42               ` Kent Overstreet
2021-09-23 18:00                 ` Johannes Weiner
2021-09-23 19:31                   ` Matthew Wilcox
2021-09-23 20:20                   ` Kent Overstreet
2021-10-16  3:28               ` Matthew Wilcox
2021-10-18 16:47                 ` Johannes Weiner [this message]
2021-10-18 18:12                   ` Kent Overstreet
2021-10-18 20:45                     ` Johannes Weiner
2021-10-19 16:11                       ` Splitting struct page into multiple types " Kent Overstreet
2021-10-19 17:06                         ` Gao Xiang
2021-10-19 17:34                           ` Matthew Wilcox
2021-10-19 17:54                             ` Gao Xiang
2021-10-20 17:46                               ` Kent Overstreet
2021-10-19 17:37                         ` Jason Gunthorpe
2021-10-19 21:14                       ` David Howells
2021-10-18 18:28                   ` Folios for 5.15 request " Matthew Wilcox
2021-10-18 21:56                     ` Johannes Weiner
2021-10-18 23:16                       ` Kirill A. Shutemov
2021-10-19 15:16                         ` Johannes Weiner
2021-10-20  3:19                           ` Matthew Wilcox
2021-10-20  7:50                           ` David Hildenbrand
2021-10-20 17:26                             ` Matthew Wilcox
2021-10-20 18:04                               ` David Hildenbrand
2021-10-21  6:51                                 ` Christoph Hellwig
2021-10-21  7:21                                   ` David Hildenbrand
2021-10-21 12:03                                     ` Kent Overstreet
2021-10-21 12:35                                       ` David Hildenbrand
2021-10-21 12:38                                         ` Christoph Hellwig
2021-10-21 13:00                                           ` David Hildenbrand
2021-10-21 12:41                                         ` Matthew Wilcox
2021-10-20 17:39                           ` Kent Overstreet
2021-10-21 21:37                             ` Johannes Weiner
2021-10-22  1:52                               ` Matthew Wilcox
2021-10-22  7:59                                 ` David Hildenbrand
2021-10-22 13:01                                   ` Matthew Wilcox
2021-10-22 14:40                                     ` David Hildenbrand
2021-10-23  2:22                                       ` Matthew Wilcox
2021-10-23  5:02                                         ` Christoph Hellwig
2021-10-23  9:58                                         ` David Hildenbrand
2021-10-23 16:00                                           ` Kent Overstreet
2021-10-23 21:41                                             ` Matthew Wilcox
2021-10-23 22:23                                               ` Kent Overstreet
2021-10-25 15:35                                 ` Johannes Weiner
2021-10-25 15:52                                   ` Matthew Wilcox
2021-10-25 16:05                                   ` Kent Overstreet
2021-10-16 19:07               ` Matthew Wilcox
2021-10-18 17:25                 ` Johannes Weiner
2021-09-21 22:18           ` Folio discussion recap Matthew Wilcox
2021-09-23  0:45             ` Ira Weiny
2021-09-23  3:41               ` Matthew Wilcox
2021-09-23 22:12                 ` Ira Weiny
2021-09-29 15:24                   ` Matthew Wilcox
2021-09-21 21:59         ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YW2lKcqwBZGDCz6T@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=kent.overstreet@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=torvalds@linux-foundation.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).