From: Matthew Wilcox <willy@infradead.org>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Andreas Dilger <adilger@dilger.ca>,
	Johannes Weiner <hannes@cmpxchg.org>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: Discontiguous folios/pagesets
Date: Mon, 30 Aug 2021 19:43:01 +0100
Message-ID: <YS0mtYZ+PEAaM7pI@casper.infradead.org>
In-Reply-To: <20210830182818.GA9892@magnolia>

On Mon, Aug 30, 2021 at 11:28:18AM -0700, Darrick J. Wong wrote:
> On Sat, Aug 28, 2021 at 01:27:29PM -0600, Andreas Dilger wrote:
> > On Aug 28, 2021, at 1:04 PM, Matthew Wilcox <willy@infradead.org> wrote:
> > > 
> > > The current folio work is focused on permitting the VM to use
> > > physically contiguous chunks of memory.  Both Darrick and Johannes
> > > have pointed out the advantages of supporting logically-contiguous,
> > > physically-discontiguous chunks of memory.  Johannes wants to be able to
> > > use order-0 allocations to allocate larger folios, getting the benefit
> > > of managing the memory in larger chunks without requiring the memory
> > > allocator to be able to find contiguous chunks.  Darrick wants to support
> > > non-power-of-two block sizes.
> > 
> > What is the use case for non-power-of-two block sizes?  The main question
> > is whether that use case is important enough to add the complexity and
> > overhead in order to support it?
> 
> For copy-on-write to an XFS realtime volume where the allocation extent
> size (we support bigalloc too! :P) is not a power of two (e.g. you set
> up a 4 disk raid5 with 64k stripes, now the extent size is 192k).
> 
> Granted, I don't think folios handling 192k chunks is absolutely
> *required* for folios; the only hard requirement is that if any page in
> a 192k extent becomes dirty, the rest have to get written out all at the
> same time, and the cow remap can only happen after the last page
> finishes writeback.

I /think/ "all pages get written out at the same time" is basically the
same thing as "support a non-power-of-two block size".

If we only have page A in the cache at the time it's going to be written
back, we have to read in pages B and C in order to calculate the parity P.
That will annoy writeback-because-we're-low-on-memory; I know we allow
a certain amount of allocation to happen in the writeback path, but
requiring 128kB to be allocated is a bit much.
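
To make the arithmetic concrete, here is a minimal sketch of the parity
calculation for the 4-disk RAID5 example above, assuming plain XOR parity
over three 64k data chunks.  The names (STRIPE_UNIT, xor_parity,
bytes_to_read) are made up for illustration and are not kernel or md
interfaces.

```python
STRIPE_UNIT = 64 * 1024   # 64k stripe unit, from the example in the thread
DATA_DISKS = 3            # 4-disk RAID5: 3 data chunks + 1 parity chunk


def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks."""
    size = len(chunks[0])
    acc = 0
    for chunk in chunks:
        if len(chunk) != size:
            raise ValueError("all chunks must be the same size")
        # XOR whole chunks at once by treating them as big integers.
        acc ^= int.from_bytes(chunk, "little")
    return acc.to_bytes(size, "little")


def bytes_to_read(cached_chunks):
    """Bytes that must be read in before parity can be computed,
    given how many of the data chunks are already in the cache."""
    return (DATA_DISKS - cached_chunks) * STRIPE_UNIT
```

With only A cached, bytes_to_read(1) is 2 * 64k = 128k, which is the
allocation the writeback path would have to make.  The same XOR also
reconstructs a missing chunk: xor_parity([P, B, C]) gives back A.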

So we have to allow page A being dirty to pin pages B and C in the cache.
I suppose that's possible; we could make (clean) pages B and C follow
page A on the LRU, so they're going to still be in RAM at the time that
page A is written back.  I don't fully understand how the LRU works,
but I assume it'd be a nightmare to ensure that A, B and C all move
around the system in the same way.  Much easier to ensure that ABC stay
linked together and all get written back at once.
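
As a rough sketch of what "written back at once" means for a 192k
extent: dirtying any one page expands the writeback set to every page in
the same extent.  This assumes 4 KiB pages (not stated in the thread) and
uses hypothetical helper names.  Because 192k is not a power of two, the
extent start has to be found by division rather than by masking off low
bits.

```python
PAGE_SIZE = 4096                              # assumption: 4 KiB pages
EXTENT_SIZE = 192 * 1024                      # 192k extent from the thread
PAGES_PER_EXTENT = EXTENT_SIZE // PAGE_SIZE   # 48 pages per extent


def extent_page_range(page_index):
    """Return the range of page indices sharing an extent with page_index."""
    # Non-power-of-two extent: round down with integer division, not a mask.
    start = (page_index // PAGES_PER_EXTENT) * PAGES_PER_EXTENT
    return range(start, start + PAGES_PER_EXTENT)


def pages_to_write(dirty_pages):
    """Expand a set of dirty page indices to whole extents."""
    result = set()
    for index in dirty_pages:
        result.update(extent_page_range(index))
    return sorted(result)
```

For example, a single dirty page at index 50 pulls in pages 48 through 95,
and dirty pages 47 and 48 straddle an extent boundary, so they pull in two
full extents (96 pages).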

