linux-kernel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Stephen Bates <sbates@raithlin.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Dan Williams <dan.j.williams@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
	linux-rdma@vger.kernel.org, linux-block@vger.kernel.org,
	Linux MM <linux-mm@kvack.org>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Matthew Wilcox <willy@linux.intel.com>,
	jgunthorpe@obsidianresearch.com, haggaie@mellanox.com,
	Jens Axboe <axboe@fb.com>, Jonathan Corbet <corbet@lwn.net>,
	jim.macdonald@everspin.com, sbates@raithin.com,
	Logan Gunthorpe <logang@deltatee.com>,
	David Woodhouse <dwmw2@infradead.org>,
	"Raj, Ashok" <ashok.raj@intel.com>
Subject: Re: [PATCH 0/3] iopmem : A block device for PCIe memory
Date: Wed, 26 Oct 2016 08:19:03 +1100
Message-ID: <20161025211903.GD14023@dastard>
In-Reply-To: <20161025115043.GA14986@cgy1-donard.priv.deltatee.com>

On Tue, Oct 25, 2016 at 05:50:43AM -0600, Stephen Bates wrote:
> Hi Dave and Christoph
> 
> On Fri, Oct 21, 2016 at 10:12:53PM +1100, Dave Chinner wrote:
> > On Fri, Oct 21, 2016 at 02:57:14AM -0700, Christoph Hellwig wrote:
> > > On Fri, Oct 21, 2016 at 10:22:39AM +1100, Dave Chinner wrote:
> > > > You do realise that local filesystems can silently change the
> > > > location of file data at any point in time, so there is no such
> > > > thing as a "stable mapping" of file data to block device addresses
> > > > in userspace?
> > > >
> > > > If you want remote access to the blocks owned and controlled by a
> > > > filesystem, then you need to use a filesystem with a remote locking
> > > > mechanism to allow co-ordinated, coherent access to the data in
> > > > those blocks. Anything else is just asking for ongoing, unfixable
> > > > filesystem corruption or data leakage problems (i.e.  security
> > > > issues).
> > >
> 
> Dave are you saying that even for local mappings of files on a DAX
> capable system it is possible for the mappings to move on you unless
> the FS supports locking?

Yes.

> Does that not mean DAX on such FS is
> inherently broken?

No. DAX is accessed through a virtual mapping layer that abstracts
the physical location from userspace applications.

Example: think copy-on-write overwrites. The overwrite occurs
atomically from the perspective of userspace and starts by
invalidating any current mappings userspace has of that physical
location. The location is changed, the data copied in, and then when
the locks are released userspace can fault in a new page table
mapping on the next access....
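
As a toy userspace model of that sequence (not kernel code: the
ToyFile class, its method names and the block numbers are all invented
for illustration, and locking is left out entirely), something like
this shows why the application never depends on the physical location:

```python
class ToyFile:
    """Toy model of a file whose data can move between physical blocks."""

    def __init__(self, data, block):
        self.storage = {block: bytearray(data)}  # physical block -> contents
        self.block = block                       # current physical location
        self.page_table = {}                     # "userspace" mapping: page 0 -> block
        self.next_free = block + 1               # trivial block allocator

    def fault(self):
        # Page fault: look up the file's current block and map it.
        self.page_table[0] = self.block

    def read(self):
        # Access goes through the mapping; fault it in if it is missing.
        if 0 not in self.page_table:
            self.fault()
        return bytes(self.storage[self.page_table[0]])

    def cow_overwrite(self, new_data):
        # 1. Invalidate any existing mapping of the old location.
        self.page_table.pop(0, None)
        # 2. Allocate a new location and copy the data in.
        new_block = self.next_free
        self.next_free += 1
        self.storage[new_block] = bytearray(new_data)
        # 3. Switch the file to the new location and free the old one.
        old_block, self.block = self.block, new_block
        del self.storage[old_block]
        # 4. Nothing is remapped here; the next access faults it in.


f = ToyFile(b"old", block=7)
assert f.read() == b"old" and f.page_table[0] == 7
f.cow_overwrite(b"new")
assert 0 not in f.page_table          # old mapping is gone
assert f.read() == b"new" and f.page_table[0] == 8
```

Anything that cached "block 7" outside the page table (say, a remote
machine handed the raw address) would now be pointing at freed space,
which is the problem with treating the mapping as stable.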

> > > And at least for XFS we have such a mechanism :)  E.g. I have a
> > > prototype of a pNFS layout that uses XFS+DAX to allow clients to do
> > > RDMA directly to XFS files, with the same locking mechanism we use
> > > for the current block and scsi layout in xfs_pnfs.c.
> 
> Thanks for fixing this issue on XFS Christoph! I assume this problem
> continues to exist on the other DAX capable FS?

Yes, but if they implement the exportfs API that supplies this
capability, they'll be able to use pNFS, too.

> One more reason to consider a move to /dev/dax I guess ;-)...

That doesn't get rid of the need for sane access control arbitration
across all machines that are directly accessing the storage. That's
the problem pNFS solves, regardless of whether your direct access
target is a filesystem, a block device or object storage...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
