From mboxrd@z Thu Jan 1 00:00:00 1970
In-Reply-To: <20161025211903.GD14023@dastard>
References: <1476826937-20665-1-git-send-email-sbates@raithlin.com>
 <20161019184814.GC16550@cgy1-donard.priv.deltatee.com>
 <20161020232239.GQ23194@dastard>
 <20161021095714.GA12209@infradead.org>
 <20161021111253.GQ14023@dastard>
 <20161025115043.GA14986@cgy1-donard.priv.deltatee.com>
 <20161025211903.GD14023@dastard>
Date: Sun, 6 Nov 2016 08:05:59 -0600
Subject: Re: [PATCH 0/3] iopmem : A block device for PCIe memory
From: "Stephen Bates"
To: "Dave Chinner"
Cc: "Stephen Bates", "Christoph Hellwig", "Dan Williams",
 "linux-kernel@vger.kernel.org", "linux-nvdimm@lists.01.org",
 linux-rdma@vger.kernel.org, linux-block@vger.kernel.org, "Linux MM",
 "Ross Zwisler", "Matthew Wilcox", jgunthorpe@obsidianresearch.com,
 haggaie@mellanox.com, "Jens Axboe", "Jonathan Corbet",
 jim.macdonald@everspin.com, sbates@raithin.com, "Logan Gunthorpe",
 "David Woodhouse", "Raj, Ashok"
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, October 25, 2016 3:19 pm, Dave Chinner wrote:
> On Tue, Oct 25, 2016 at 05:50:43AM -0600, Stephen Bates wrote:
>>
>> Dave, are you saying that even for local mappings of files on a DAX
>> capable system it is possible for the mappings to move on you unless the
>> FS supports locking?
>>
>
> Yes.
>
>
>> Does that not mean DAX on such FS is
>> inherently broken?
>
> No. DAX is accessed through a virtual mapping layer that abstracts
> the physical location from userspace applications.
>
> Example: think copy-on-write overwrites. It occurs atomically from
> the perspective of userspace and starts by invalidating any current
> mappings userspace has of that physical location. The location is
> changed, the data copied in, and then when the locks are released
> userspace can fault in a new page table mapping on the next access....

Dave

Thanks for the good input and for correcting some of my DAX
misconceptions! We will certainly be taking this into account as we
consider v1.

>
>>>> And at least for XFS we have such a mechanism :) E.g. I have a
>>>> prototype of a pNFS layout that uses XFS+DAX to allow clients to do
>>>> RDMA directly to XFS files, with the same locking mechanism we use
>>>> for the current block and scsi layout in xfs_pnfs.c.
>>
>> Thanks for fixing this issue on XFS Christoph! I assume this problem
>> continues to exist on the other DAX capable FS?
>
> Yes, but if they implement the exportfs API that supplies this
> capability, they'll be able to use pNFS, too.
>
>> One more reason to consider a move to /dev/dax I guess ;-)...
>>
>
> That doesn't get rid of the need for sane access control arbitration
> across all machines that are directly accessing the storage. That's the
> problem pNFS solves, regardless of whether your direct access target is a
> filesystem, a block device or object storage...

Fair point. I am still hoping for a bit more discussion on the best
choice of user-space interface for this work. If/When that happens we
will take it into account when we look at spinning the patchset.

Stephen
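
For readers following the thread, here is a minimal sketch of the
copy-on-write sequence Dave describes above, written as a toy Python
model rather than kernel code. The class ToyDaxFile, its fields and the
mapping id "app" are hypothetical names chosen for illustration; a real
filesystem does this with page tables, extent maps and its own locking.
The sketch only aims to show why a moving physical location is invisible
to a local userspace mapping: the mapping is invalidated first, and the
next access faults in the new location.

```python
import threading


class ToyDaxFile:
    """Toy model (hypothetical): one file block, its physical location,
    and the userspace mappings that point at it."""

    def __init__(self, data: bytes):
        self.storage = {0: bytes(data)}   # physical location -> contents
        self.location = 0                 # where the file block lives now
        self.lock = threading.Lock()      # stands in for the filesystem locks
        self.page_table = {}              # mapping id -> physical location
        self.faults = 0                   # number of simulated page faults

    def read(self, mapping_id: str) -> bytes:
        """Userspace access through the virtual mapping layer."""
        with self.lock:
            if mapping_id not in self.page_table:
                # Simulated page fault: map the block's current location.
                self.page_table[mapping_id] = self.location
                self.faults += 1
            return self.storage[self.page_table[mapping_id]]

    def cow_overwrite(self, new_data: bytes) -> None:
        """Copy-on-write overwrite, in the order given in the thread."""
        with self.lock:
            # 1. Invalidate every current userspace mapping of the block.
            self.page_table.clear()
            # 2. Change the physical location and copy the data in.
            old_location = self.location
            new_location = old_location + 1
            self.storage[new_location] = bytes(new_data)
            self.location = new_location
            # The old location is now free for reuse.
            del self.storage[old_location]
        # 3. Locks released: the next read() faults in a new mapping.


# Small usage example: the application never sees the old location again.
f = ToyDaxFile(b"old")
before = f.read("app")
f.cow_overwrite(b"new")
after = f.read("app")
```

A remote peer doing RDMA to the old physical location has no page fault
to catch the move, which is the gap the locking and layout mechanisms
discussed in the thread are meant to close.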