From: Huaisheng Ye <yehs2007@zoho.com>
To: Jan Kara <jack@suse.cz>
Cc: Mike Snitzer <snitzer@redhat.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	chengnt <chengnt@lenovo.com>, Dave Chinner <david@fromorbit.com>,
	colyli <colyli@suse.de>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: Snapshot target and DAX-capable devices
Date: Thu, 13 Dec 2018 00:11:46 +0800
Message-ID: <167a3303a01.11a848ab768799.5161498967766415143@zoho.com>
In-Reply-To: <20180831094255.GB11622@quack2.suse.cz>

 ---- On Fri, 31 Aug 2018 17:42:55 +0800 Jan Kara <jack@suse.cz> wrote ---- 
 > On Fri 31-08-18 09:38:09, Dave Chinner wrote: 
 > > On Thu, Aug 30, 2018 at 03:47:32PM -0400, Mikulas Patocka wrote: 
 > > >  
 > > >  
 > > > On Thu, 30 Aug 2018, Jeff Moyer wrote: 
 > > >  
 > > > > Mike Snitzer <snitzer@redhat.com> writes: 
 > > > >  
 > > > > > Until we properly add DAX support to dm-snapshot I'm afraid we really do 
 > > > > > need to tolerate this "regression".  Since reality is the original 
 > > > > > support for snapshot of a DAX DM device never worked in a robust way. 
 > > > >  
 > > > > Agreed. 
 > > > >  
 > > > > -Jeff 
 > > >  
 > > > You can't support dax on snapshot - if someone maps a block and the block  
 > > > needs to be moved, then what? 
 > >  
 > > This is only a problem for access via mmap and page faults. 
 > >  
 > > At the filesystem level, it's no different to the existing direct IO 
 > > algorithm for read/write IO - we simply allocate new space, copy the 
 > > data we need to copy into the new space (may be no copy needed), and 
 > > then write the new data into the new space. I'm pretty sure that for 
 > > bio-based IO to dm-snapshot devices the algorithm will be exactly 
 > > the same. 
 > >  
 > > However, for direct access via mmap, we have to modify how the 
 > > userspace virtual address is mapped to the physical location. IOWs, 
 > > during the COW operation, we have to invalidate all existing user 
 > > mappings we have for that physical address. This means we have to do 
 > > an invalidation after the allocate/copy part of the COW operation. 
 > > 
 > > If we are doing this during a page fault, it means we'll probably 
 > > have to restart the page fault so it can look up the new physical 
 > > address associated with the faulting user address. After we've done 
 > > the invalidation, any new (or restarted) page fault finds the 
 > > location of new copy we just made, maps it into the user address 
 > > space, updates the ptes and we're all good. 
 > >  
 > > Well, that's the theory. We haven't implemented this for XFS yet, so 
 > > it might end up a little different, and we might yet hit unexpected 
 > > problems (it's DAX, that's what happens :/). 
 >  
 > Yes, that's outline of a plan :) 
 >  
 > > It's a whole different ballgame for a dm-snapshot device - block 
 > > devices are completely unaware of page faults to DAX file mappings. 
 >  
 > Actually, block devices are not completely unaware of DAX page faults - 
 > they will get ->direct_access callback for the fault range. It does not 
 > currently convey enough information - we also need to inform the block 
 > device whether it is read or write. But that's about all that's needed to 
 > add AFAICT. And by comparing returned PFN with the one we have stored in 
 > the radix tree (which we have if that file offset is mapped by anybody), 
 > the filesystem / DAX code can tell whether remapping happened and do the 
 > unmapping. 

Hi Jan,

I am investigating how to make dm-snapshot support DAX, and I have posted a
patchset upstream for comments. Any suggestions are welcome.
# https://lkml.org/lkml/2018/11/21/281

At first, I had not considered the case of mmap write faults.
After Dan's reply and this email thread, I now have a clearer understanding.

The problem is this: even if the virtual dm block device is told, through
PROT_WRITE, that the mapping may be written, dm-snapshot has no chance to
detect the write when userspace stores directly to the mapped virtual address
of the origin device, for example with memcpy.
dm-snapshot could prepare a COW area for backing up the origin's blocks within
the ->direct_access callback for the fault range, but when would it get the
opportunity to read the data from the origin device and save it to the COW
area?
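
To make the question concrete, here is a small userspace toy in Python. It is
not kernel code and does not reflect how dm-snapshot is implemented:
ToySnapshotOrigin, write_bio, read_snapshot and the write argument to
direct_access are all names invented for this sketch. It only models the
difference between a write the device sees (block I/O) and a store made
through a mapping the device has already handed out.

```python
CHUNK = 4  # bytes per chunk, tiny on purpose


class ToySnapshotOrigin:
    """Userspace toy, not kernel code: an origin 'device' with one snapshot."""

    def __init__(self, data: bytes):
        self.mem = bytearray(data)  # stands in for persistent memory
        self.cow = {}               # chunk index -> preserved original bytes

    def _preserve(self, chunk: int) -> None:
        # Copy-out step: save the original chunk once, before its first change.
        if chunk not in self.cow:
            start = chunk * CHUNK
            self.cow[chunk] = bytes(self.mem[start:start + CHUNK])

    def write_bio(self, chunk: int, data: bytes) -> None:
        # Block I/O path: the device sees every write, so it can copy first.
        assert len(data) == CHUNK
        self._preserve(chunk)
        start = chunk * CHUNK
        self.mem[start:start + CHUNK] = data

    def direct_access(self, chunk: int, write: bool = False) -> memoryview:
        # DAX-like path: hand out the memory itself. The 'write' flag is the
        # hypothetical extra information discussed in this thread.
        if write:
            self._preserve(chunk)
        start = chunk * CHUNK
        return memoryview(self.mem)[start:start + CHUNK]

    def read_snapshot(self, chunk: int) -> bytes:
        # The snapshot shows the preserved copy if one exists, else the origin.
        start = chunk * CHUNK
        return self.cow.get(chunk, bytes(self.mem[start:start + CHUNK]))


dev = ToySnapshotOrigin(b"AAAABBBBCCCC")

dev.write_bio(0, b"aaaa")            # device sees the write, copies first
print(dev.read_snapshot(0))          # b'AAAA': snapshot is intact

m = dev.direct_access(1)             # mapped with no write intent conveyed
m[:] = b"bbbb"                       # plain store, device is never notified
print(dev.read_snapshot(1))          # b'bbbb': the original data is lost

w = dev.direct_access(2, write=True) # write intent known at mapping time
w[:] = b"cccc"
print(dev.read_snapshot(2))          # b'CCCC': copied before the store
```

In the toy, the copy for chunk 2 simply happens inside direct_access before
the mapping is returned. Whether the real ->direct_access callback can perform
or wait for such a copy is exactly what I am asking about.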

---
Cheers,
Huaisheng Ye

