From: Goldwyn Rodrigues <rgoldwyn@suse.de>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-fsdevel@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [RFC PATCH 0/5] Shared memory for shared extents
Date: Mon, 25 Oct 2021 11:43:35 -0500
Message-ID: <20211025164335.t7he6miollf6un2j@fiona>
In-Reply-To: <YXbQm6TxaWcLnpal@casper.infradead.org>

On 16:43 25/10, Matthew Wilcox wrote:
> On Mon, Oct 25, 2021 at 09:53:01AM -0500, Goldwyn Rodrigues wrote:
> > On  2:43 23/10, Matthew Wilcox wrote:
> > > On Fri, Oct 22, 2021 at 03:15:00PM -0500, Goldwyn Rodrigues wrote:
> > > > This is an attempt to reduce the memory footprint by using a shared
> > > > page(s) for shared extent(s) in the filesystem. I am hoping to start a
> > > > discussion to iron out the details for implementation.
> > > 
> > > When you say "Shared extents", you mean reflinks, which are COW, right?
> > 
> > Yes, shared extents are extents which are shared on disk by two or more
> > files. Yes, same as reflinks. Just to explain with an example:
> > 
> > If two files, f1 and f2 have shared extent(s), and both files are read. Each
> > file's mapping->i_pages will hold a copy of the contents of the shared
> > extent on disk. So, f1->mapping will have one copy and f2->mapping will
> > have another copy.
> > 
> > For reads (and only reads), if we use underlying device's mapping, we
> > can save on duplicate copy of the pages.
> 
> Yes; I'm familiar with the problem.  Dave Chinner and I had a great
> discussion about it at LCA a couple of years ago.
> 
> The implementation I've had in mind for a while is that the filesystem
> either creates a separate inode for a shared extent, or (as you've
> done here) uses the bdev's inode.  We can discuss the pros/cons of
> that separately.
> 
> To avoid the double-lookup problem, I was intending to generalise DAX
> entries into PFN entries.  That way, if the read() (or mmap read fault)
> misses in the inode's cache, we can look up the shared extent cache,
> and then cache the physical address of the memory in the inode.

I am not sure I understand. Could you provide an example? Would this be
specific to DAX? What about standard block devices?
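
To make the question concrete, here is one way to read the proposal, as a
small userspace model in C rather than kernel code. Everything in it is an
assumption made for illustration: toy_page, shared_cache, toy_inode and
lookup_pfn() are hypothetical names, the "PFN" is just a number, and a zero
entry stands for a miss. It only shows the order of lookups: inode cache
first, then the shared extent cache, then remember the result in the inode.

```c
#include <assert.h>

#define NR_SLOTS 8	/* pages per toy file and per toy device */

/* Hypothetical stand-in for a page held in the shared extent cache. */
struct toy_page {
	unsigned long pfn;	/* pretend physical frame number, never 0 */
	int valid;		/* 0 until the page has been "read" */
};

/* Shared extent cache, indexed by device page index. */
struct shared_cache {
	struct toy_page pages[NR_SLOTS];
};

/* Per-inode cache: each slot is 0 (miss) or a cached PFN entry. */
struct toy_inode {
	unsigned long pfn_entry[NR_SLOTS];
	unsigned int dev_idx[NR_SLOTS];	/* file page index -> device page index */
};

/*
 * Look up file page @idx. On an inode cache miss, fall back to the shared
 * cache and store the PFN in the inode so the next lookup is a single step.
 * *went_to_shared reports whether the second lookup was needed.
 */
static unsigned long lookup_pfn(struct toy_inode *inode, struct shared_cache *sc,
				unsigned int idx, int *went_to_shared)
{
	struct toy_page *page;

	assert(idx < NR_SLOTS && inode->dev_idx[idx] < NR_SLOTS);

	if (inode->pfn_entry[idx]) {
		*went_to_shared = 0;
		return inode->pfn_entry[idx];
	}

	page = &sc->pages[inode->dev_idx[idx]];
	if (!page->valid) {
		/* Stand-in for reading the block from disk. */
		page->pfn = 1000 + inode->dev_idx[idx];
		page->valid = 1;
	}

	inode->pfn_entry[idx] = page->pfn;
	*went_to_shared = 1;
	return page->pfn;
}
```

In this model, two files whose pages map to the same device index end up
holding the same PFN, and only the first lookup per file touches the shared
cache. Whether that matches what you have in mind for non-DAX block devices
is exactly what I am asking.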

> 
> That makes reclaim/eviction of the page in the shared extent more
> expensive because you have to iterate all the inodes which share the
> extent and remove the PFN entries before the page can be reused.

I am not sure about this, but won't it get complicated when different
files share different extents? Say shared extent SE1 belongs to f1
and f2, whereas SE2 belongs to f2 and f3.
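
The scenario can be sketched as a toy userspace model in C. This is only an
illustration under my own assumptions, not kernel code: toy_file,
shared_extent, extent_cache_in() and extent_evict_page() are hypothetical
names, and each extent is reduced to a single cached page. Each extent keeps
a list of the files that currently hold a PFN entry for its page, and
eviction has to walk that list.

```c
#include <assert.h>

#define MAX_FILES 4

/* Hypothetical file: counts how many PFN entries it currently caches. */
struct toy_file {
	int id;
	int pfn_entries;
};

/* Hypothetical shared extent with one cached page and its reverse map. */
struct shared_extent {
	struct toy_file *cachers[MAX_FILES];	/* files holding a PFN entry */
	int nr_cachers;
};

/* File @f caches a PFN entry pointing at the page of extent @se. */
static void extent_cache_in(struct shared_extent *se, struct toy_file *f)
{
	assert(se->nr_cachers < MAX_FILES);
	se->cachers[se->nr_cachers++] = f;
	f->pfn_entries++;
}

/*
 * Evict the page of @se: every file that cached a PFN entry must drop it
 * before the page can be reused. Returns the number of files visited.
 */
static int extent_evict_page(struct shared_extent *se)
{
	int visited = 0;

	for (int i = 0; i < se->nr_cachers; i++) {
		se->cachers[i]->pfn_entries--;
		visited++;
	}
	se->nr_cachers = 0;
	return visited;
}
```

With SE1 cached by f1 and f2, and SE2 cached by f2 and f3, f2 sits on both
reverse maps. Evicting the SE1 page visits f1 and f2 and leaves the SE2
entries in f2 and f3 alone, so each extent needs its own list of sharers.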

> 
> Perhaps we should have a Zoom meeting about this before producing duelling
> patch series?  I can host if you're interested.

Yes, I think that would be nice. I am in the US Central time zone.
If possible, I would like to include David Disseldorp, who is based in
Germany.

-- 
Goldwyn
