linux-kernel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: Ruan Shiyang <ruansy.fnst@cn.fujitsu.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"hch@lst.de" <hch@lst.de>, "rgoldwyn@suse.de" <rgoldwyn@suse.de>,
	"Qi, Fuli" <qi.fuli@fujitsu.com>,
	"Gotou, Yasunori" <y-goto@fujitsu.com>
Subject: Re: Reply: Re: [RFC PATCH 0/8] dax: Add a dax-rmap tree to support reflink
Date: Tue, 28 Apr 2020 21:24:41 +1000
Message-ID: <20200428112441.GH2040@dread.disaster.area>
In-Reply-To: <20200428111636.GK29705@bombadil.infradead.org>

On Tue, Apr 28, 2020 at 04:16:36AM -0700, Matthew Wilcox wrote:
> On Tue, Apr 28, 2020 at 05:32:41PM +0800, Ruan Shiyang wrote:
> > > On 2020/4/28 2:43 PM, Dave Chinner wrote:
> > > On Tue, Apr 28, 2020 at 06:09:47AM +0000, Ruan, Shiyang wrote:
> > > > > On 2020/4/27 20:28:36, "Matthew Wilcox" <willy@infradead.org> wrote:
> > > > > On Mon, Apr 27, 2020 at 04:47:42PM +0800, Shiyang Ruan wrote:
> > > > > > >   This patchset is an attempt to resolve the shared 'page cache' problem
> > > > > > >   for fsdax.
> > > > > > 
> > > > > > >   In order to track multiple mappings and indexes on one page, I
> > > > > > >   introduced a dax-rmap rb-tree to manage the relationship.  A dax entry
> > > > > > >   will be associated more than once if it is shared.  The second time we
> > > > > > >   associate this entry, we create this rb-tree and store its root in
> > > > > > >   page->private (not used in fsdax).  We insert (->mapping, ->index) in
> > > > > > >   dax_associate_entry() and delete it in dax_disassociate_entry().
> > > > > 
> > > > > Do we really want to track all of this on a per-page basis?  I would
> > > > > have thought a per-extent basis was more useful.  Essentially, create
> > > > > a new address_space for each shared extent.  Per page just seems like
> > > > > a huge overhead.
> > > > > 
> > > > > Per-extent tracking sounds like a nice idea to me.  I hadn't thought of
> > > > > it yet...
> > > > > 
> > > > > But the extent info is maintained by the filesystem.  I think we need a
> > > > > way to obtain this info from the FS when associating a page.  It may be
> > > > > a bit complicated.  Let me think about it...
> > > 
> > > That's why I want the -user of this association- to do a filesystem
> > > > callout instead of keeping its own naive tracking infrastructure.
> > > > The filesystem can do an efficient, on-demand reverse mapping lookup
> > > > from its own extent tracking infrastructure, and there's zero
> > > runtime overhead when there are no errors present.
> > > 
> > > At the moment, this "dax association" is used to "report" a storage
> > > media error directly to userspace. I say "report" because what it
> > > does is kill userspace processes dead. The storage media error
> > > actually needs to be reported to the owner of the storage media,
> > > > which in the case of FS-DAX is the filesystem.
> > 
> > Understood.
> > 
> > > BTW, this is the usage in memory-failure, so what about rmap?  I have not
> > > worked out how to use this tracking in rmap.  Do you have any ideas?
> > 
> > > 
> > > That way the filesystem can then look up all the owners of that bad
> > > media range (i.e. the filesystem block it corresponds to) and take
> > > appropriate action. e.g.
> > 
> > > I tried writing a function to look up the info for all the owners of one
> > > block in xfs for memory-failure use.  It was dropped from this patchset
> > > because I found out that this lookup function needs 'rmapbt' to be enabled
> > > at mkfs time.  But by default, rmapbt is disabled.  I am not sure if it
> > > matters...
> 
> I'm pretty sure you can't have shared extents on an XFS filesystem if you
> _don't_ have the rmapbt feature enabled.  I mean, that's why it exists.

You're confusing reflink with rmap. :)

rmapbt does all the reverse mapping tracking, reflink just does the
shared data extent tracking.

But given that anyone who wants to use DAX with reflink is going to
have to mkfs their filesystem anyway (to turn on reflink), requiring
that rmapbt is also turned on is not a big deal. Especially as we
can check it at mount time in the kernel...
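
As a rough illustration of that mount-time check (this is not actual
XFS code): the sketch below refuses a mount that asks for DAX on a
reflink filesystem unless rmapbt is also enabled. The struct, the flag
names and the function are all hypothetical stand-ins, and it assumes
the mkfs-time features and the DAX request are both visible when the
filesystem is mounted.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical feature bits; these are NOT the real XFS superblock flags. */
#define SKETCH_FEAT_REFLINK	(1u << 0)
#define SKETCH_FEAT_RMAPBT	(1u << 1)

/* Hypothetical, heavily simplified stand-in for per-mount state. */
struct sketch_mount {
	unsigned int	features;	/* on-disk features chosen at mkfs time */
	bool		dax_requested;	/* DAX was asked for at mount time */
};

/*
 * Return 0 if the mount may proceed, -EINVAL otherwise.
 *
 * Assumed policy: DAX together with reflink is only allowed when
 * rmapbt is present, so that owners of a shared block can be found
 * through a reverse mapping lookup. Every other combination passes.
 */
static int sketch_check_dax_reflink(const struct sketch_mount *mp)
{
	bool reflink = (mp->features & SKETCH_FEAT_REFLINK) != 0;
	bool rmapbt = (mp->features & SKETCH_FEAT_RMAPBT) != 0;

	if (mp->dax_requested && reflink && !rmapbt)
		return -EINVAL;

	return 0;
}
```

With a check of this shape, a filesystem made with reflink but without
rmapbt still mounts normally as long as DAX is not requested; only the
DAX + reflink combination is rejected.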

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
