Linux-Fsdevel Archive on lore.kernel.org
From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
	xfs@oss.sgi.com, linux-fsdevel@vger.kernel.org,
	linux-nvdimm@ml01.01.org
Subject: Re: xfs: untangle the direct I/O and DAX path, fix DAX locking
Date: Tue, 28 Jun 2016 15:10:59 +0200
Message-ID: <20160628131059.GA30475@lst.de>
In-Reply-To: <20160624230045.GG12670@dastard>

On Sat, Jun 25, 2016 at 09:00:45AM +1000, Dave Chinner wrote:
> > 
> > Sorry, but this is simply broken - allowing apps to opt-in behavior
> > (e.g. like we're using O_DIRECT) is always fine.  Requiring
> > filesystem-specific tuning that has effects outside the app to get
> > existing documented behavior is not how to design APIs.
> 
> Using DAX is an *admin decision*, not an application decision.

Of course - that's exactly my point.

> Indeed, it's a mount option right now, and that's most definitely not
> something the application can turn on or off! Inode flags allow the
> admin to decide that two apps working on the same filesystem can use
> (or not use) DAX independently, rather than needing to put them on
> different filesystems.

Right.  And an existing application can get DAX turned on behind its
back, and will now suddenly get different synchronization behavior.
That is, if its writes happen to be aligned to the fs block size.
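
As an illustration of the alignment condition mentioned above, here is a
minimal sketch of such a check. It is not the XFS implementation: the
function name is hypothetical, and the 4096-byte default block size and the
power-of-two assumption are chosen only for the example.

```python
def is_block_aligned(offset, length, block_size=4096):
    """Return True if a write covers only whole filesystem blocks.

    Assumes block_size is a power of two (an assumption of this sketch).
    """
    mask = block_size - 1
    # Both the start and the end of the write must sit on a block boundary.
    return (offset & mask) == 0 and ((offset + length) & mask) == 0
```

A 4096-byte write at offset 0 passes this check, while the same write at
offset 512 does not.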

> > Maybe we'll need to opt-in to use DAX for mmap, but giving the same
> > existing behavior for read and write and avoiding a copy to the pagecache
> > is an obvious win.
> 
> You can't use DAX just for mmap. It's an inode scope behaviour -
> once it's turned on, all accesses to that inode - regardless of user
> interface - must use DAX. It's all or nothing, not a per file
> descriptor/mmap context option.

Right now it is.  But when discussing mmap behavior one option was to
require an opt-in to get DAX-specific mmap semantics.  For plain
read/write we have no such option and thus absolutely need to behave as
all normal reads and writes behave.  If you think the exclusive lock
for writes hurts we have two options:
 
 a) implement range locks (although they might be more expensive for
    typical loads)
 b) add a new O_* or RWF_* option to not require the synchronization
    for apps that don't want it.

Neither of those cases really is DAX-specific.
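
To make option a) concrete, here is a toy userspace sketch of a byte-range
lock. It is not kernel code and not a proposed design: the class and method
names are hypothetical, and it only shows the idea that writers to disjoint
ranges proceed concurrently while overlapping writers exclude each other.

```python
import threading


class RangeLock:
    """Toy byte-range lock over half-open ranges [start, end)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # ranges currently locked, as (start, end) tuples

    def _overlaps(self, start, end):
        # Linear scan of every held range on each acquisition attempt.
        return any(s < end and start < e for s, e in self._held)

    def try_acquire(self, start, end):
        """Lock [start, end) without blocking; return False on overlap."""
        with self._cond:
            if self._overlaps(start, end):
                return False
            self._held.append((start, end))
            return True

    def acquire(self, start, end):
        """Block until [start, end) overlaps no held range, then lock it."""
        with self._cond:
            while self._overlaps(start, end):
                self._cond.wait()
            self._held.append((start, end))

    def release(self, start, end):
        """Unlock a range previously locked with the same bounds."""
        with self._cond:
            self._held.remove((start, end))
            self._cond.notify_all()
```

Every acquisition in this sketch has to consult the set of held ranges,
which hints at why range locks can cost more than a single exclusive lock
for typical loads.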


Thread overview: 27+ messages
2016-06-22 15:27 Christoph Hellwig
2016-06-22 15:27 ` [PATCH 1/8] xfs: don't pass ioflags around in the ioctl path Christoph Hellwig
2016-06-22 15:27 ` [PATCH 2/8] xfs: kill ioflags Christoph Hellwig
2016-06-22 15:27 ` [PATCH 3/8] xfs: remove s_maxbytes enforcement in xfs_file_read_iter Christoph Hellwig
2016-06-22 15:27 ` [PATCH 4/8] xfs: split xfs_file_read_iter into buffered and direct I/O helpers Christoph Hellwig
2016-06-22 15:27 ` [PATCH 5/8] xfs: stop using generic_file_read_iter for direct I/O Christoph Hellwig
2016-06-22 15:27 ` [PATCH 6/8] xfs: direct calls in the direct I/O path Christoph Hellwig
2016-06-22 15:27 ` [PATCH 7/8] xfs: split direct I/O and DAX path Christoph Hellwig
2016-09-29  2:53   ` Darrick J. Wong
2016-09-29  8:38     ` aio completions vs file_accessed race, was: " Christoph Hellwig
2016-09-29 20:18       ` Christoph Hellwig
2016-09-29 20:18         ` Christoph Hellwig
2016-09-29 20:33           ` Darrick J. Wong
2016-06-22 15:27 ` [PATCH 8/8] xfs: fix locking for DAX writes Christoph Hellwig
2016-06-23 14:22   ` Boaz Harrosh
2016-06-23 23:24 ` xfs: untangle the direct I/O and DAX path, fix DAX locking Dave Chinner
2016-06-24  1:14   ` Dan Williams
2016-06-24  7:13     ` Dave Chinner
2016-06-24  7:31       ` Christoph Hellwig
2016-06-24  7:26   ` Christoph Hellwig
2016-06-24 23:00     ` Dave Chinner
2016-06-28 13:10       ` Christoph Hellwig [this message]
2016-06-28 13:27         ` Boaz Harrosh
2016-06-28 13:39           ` Christoph Hellwig
2016-06-28 13:56             ` Boaz Harrosh
2016-06-28 15:39               ` Christoph Hellwig
2016-06-29 12:23                 ` Boaz Harrosh
