From: Matthew Wilcox <willy@infradead.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jens Axboe <axboe@kernel.dk>, Linux-MM <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-block <linux-block@vger.kernel.org>,
Chris Mason <clm@fb.com>, Dave Chinner <david@fromorbit.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCHSET v3 0/5] Support for RWF_UNCACHED
Date: Thu, 12 Dec 2019 12:05:08 -0800
Message-ID: <20191212200508.GU32169@bombadil.infradead.org>
In-Reply-To: <CAHk-=wh4J91wMrEU12DP1r+rLiThQ6wDBb+UOzOuMDkusxtdhw@mail.gmail.com>

On Thu, Dec 12, 2019 at 10:29:02AM -0800, Linus Torvalds wrote:
> On Thu, Dec 12, 2019 at 9:52 AM Matthew Wilcox <willy@infradead.org> wrote:
> > 1. We could semi-sort the pages on the LRU list. If we know we're going
> > to remove a bunch of pages, we could take a batch of them off the list,
> > sort them and remove them in-order. This probably wouldn't be terribly
> > effective.
>
> I don't think the sorting is relevant.
>
> Once you batch things, you already would get most of the locality
> advantage in the cache if it exists (and the batch isn't insanely
> large so that one batch already causes cache overflows).
>
> The problem - I suspect - is that we don't batch at all. Or rather,
> the "batching" does exist at a high level, but it's so high that
> there's just tons of stuff going on between single pages. It is at the
> shrink_page_list() level, which is pretty high up and basically does
> one page at a time with locking and a lot of tests for each page, and
> then we do "__remove_mapping()" (which does some more work) one at a
> time before we actually get to __delete_from_page_cache().
>
> So it's "batched", but it's in a huge loop, and even at that huge loop
> level the batch size is fairly small. We limit it to SWAP_CLUSTER_MAX,
> which is just 32.
>
> Thinking about it, that SWAP_CLUSTER_MAX may make sense in some other
> circumstances, but not necessarily in the "shrink clean inactive
> pages" thing. I wonder if we could just batch clean pages a _lot_ more
> aggressively. Yes, our batching loop is still very big and it might
> not help at an L1 level, but it might help in the L2, at least.
>
> In kswapd, when we have 28 GB of pages on the inactive list, a batch
> of 32 pages at a time is pretty small ;)

Yeah, that's pretty poor. I just read through it, and even if pages are
in order on the page list, they're not going to batch nicely. It'd be
nice to accumulate them and call delete_from_page_cache_batch(), but we
need to put shadow entries in to replace them, so we'd need a variant
of that which took two pagevecs.
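
A rough userspace sketch of what such a two-vector variant might look
like follows. It is only a model under stated assumptions:
delete_batch_with_shadows(), struct cache_model and struct batch_model
are made-up names, a flat array stands in for i_pages, and a counter
stands in for taking the i_pages lock. The one idea borrowed from the
kernel is that a shadow entry is a tagged value rather than a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 64	/* arbitrary size for the model */
#define BATCH_MAX   16	/* arbitrary batch capacity for the model */

struct page_model {
	size_t index;			/* offset of the page in the file */
};

struct cache_model {
	void *slots[CACHE_SLOTS];	/* page pointer, shadow entry or NULL */
	unsigned long nrpages;
	unsigned int lock_round_trips;	/* stand-in for taking the i_pages lock */
};

struct batch_model {
	unsigned int nr;
	void *entries[BATCH_MAX];
};

/* Shadow entries are tagged values: low bit set, so never a valid pointer. */
static void *mk_shadow(unsigned long v)
{
	return (void *)(((uintptr_t)v << 1) | 1);
}

static int is_shadow(const void *entry)
{
	return ((uintptr_t)entry & 1) != 0;
}

/*
 * Replace every page in @pages with the entry at the same position in
 * @shadows, paying for the lock once per batch instead of once per page.
 * Returns the number of slots replaced.
 */
static unsigned int delete_batch_with_shadows(struct cache_model *cache,
					      const struct batch_model *pages,
					      const struct batch_model *shadows)
{
	unsigned int i, replaced = 0;

	if (pages->nr != shadows->nr)
		return 0;

	cache->lock_round_trips++;
	for (i = 0; i < pages->nr; i++) {
		struct page_model *page = pages->entries[i];

		/* Skip pages that are no longer in the cache at that index. */
		if (page->index >= CACHE_SLOTS ||
		    cache->slots[page->index] != page)
			continue;
		cache->slots[page->index] = shadows->entries[i];
		cache->nrpages--;
		replaced++;
	}
	return replaced;
}

/* Fill a cache with four pages, evict them as one batch, return the count. */
static unsigned int batch_demo(struct cache_model *cache)
{
	static struct page_model pages[4] = { {3}, {4}, {5}, {9} };
	struct batch_model pvec = { 0 }, svec = { 0 };
	unsigned int i, replaced;

	for (i = 0; i < 4; i++) {
		cache->slots[pages[i].index] = &pages[i];
		cache->nrpages++;
		pvec.entries[pvec.nr++] = &pages[i];
		svec.entries[svec.nr++] = mk_shadow(100 + i);
	}
	replaced = delete_batch_with_shadows(cache, &pvec, &svec);
	for (i = 0; i < 4; i++)
		assert(is_shadow(cache->slots[pages[i].index]));
	return replaced;
}
```

Calling batch_demo() on a zeroed struct cache_model replaces all four
pages with their shadow entries in a single lock round trip. The real
code would also have to deal with accounting, reference counts and
large pages, none of which is modelled here.
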
> > 2. We could change struct page to point to the xa_node that holds them.
> > Looking up the page mapping would be page->xa_node->array and then
> > offsetof(i_pages) to get the mapping.
>
> I don't think we have space in 'struct page', and I'm pretty sure we
> don't want to grow it. That's one of the more common data structures
> in the kernel.

Oh, I wasn't clear. I meant replace page->mapping with page->xa_node.
We could still get from page to mapping, but it would be an extra
dereference. I did say it was a _bad_ idea.
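
To make the extra dereference concrete, here is a minimal userspace
sketch of the lookup. The *_model structs are simplified stand-ins, and
the xa_node field in struct page_model is hypothetical (it is the idea
being discussed, not something struct page has). The array and i_pages
member names are chosen to mirror the fields referred to above.

```c
#include <assert.h>
#include <stddef.h>

struct xarray_model {
	void *xa_head;
};

struct xa_node_model {
	struct xarray_model *array;	/* the array this node belongs to */
};

struct address_space_model {
	void *host;
	struct xarray_model i_pages;	/* embedded, not pointed to */
};

struct page_model {
	struct xa_node_model *xa_node;	/* hypothetical replacement for ->mapping */
};

/*
 * Two dependent loads (page->xa_node, then ->array) followed by pointer
 * arithmetic, where page->mapping needs a single load.
 */
static struct address_space_model *
page_mapping_model(const struct page_model *page)
{
	struct xarray_model *xa = page->xa_node->array;

	return (struct address_space_model *)
		((char *)xa - offsetof(struct address_space_model, i_pages));
}

/* Round trip: a page hanging off a node in mapping.i_pages finds mapping. */
static int mapping_demo(void)
{
	static struct address_space_model mapping;
	struct xa_node_model node = { .array = &mapping.i_pages };
	struct page_model page = { .xa_node = &node };

	assert(page_mapping_model(&page) == &mapping);
	return 1;
}
```

The subtraction of offsetof(i_pages) is the same trick container_of()
performs; it costs nothing at runtime, so the price of the scheme is
the second load.
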