From: Dan Williams <dan.j.williams@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.com>, Matthew Wilcox <willy@linux.intel.com>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	XFS Developers <xfs@oss.sgi.com>, jmoyer <jmoyer@redhat.com>
Subject: Re: [PATCH 2/2] dax: move writeback calls into the filesystems
Date: Mon, 8 Feb 2016 12:55:24 -0800
Message-ID: <CAPcyv4iHi17pv_VC=WgEP4_GgN9OvSr8xbw1bvbEFMiQ83GbWw@mail.gmail.com>
In-Reply-To: <20160208201808.GK27429@dastard>

On Mon, Feb 8, 2016 at 12:18 PM, Dave Chinner <david@fromorbit.com> wrote:
[..]
>> Setting aside the current block zeroing problem, you seem to be
>> assuming that DAX will always be faster, and that may not be true at
>> a media level.  Waiting years for some applications to determine if
>> DAX makes sense for their use case seems completely reasonable.  In
>> the meantime the apps that are already making these changes want to
>> know that a DAX mapping request has not silently dropped back to page
>> cache.  They also want to know if they successfully jumped through
>> all the hoops to get a larger-than-PTE mapping.
>>
>> I agree it is useful to be able to force DAX on an unmodified
>> application to see what happens, and it follows that if those
>> applications want to run in that mode they will need functional
>> fsync()...
>>
>> I would feel better if we were talking about specific applications and
>> performance numbers to know if forcing DAX on an application is a debug
>> facility or a production level capability.  You seem to have already
>> made that determination and I'm curious what I'm missing.
>
> I'm not setting any policy here at all.  This whole argument is
> based around the DAX mount option doing "global fs enable or
> silently turning it off" and the application not knowing about that.
>
> The whole point of having a persistent per-inode DAX flag is that
> it is a policy mechanism, not a policy.  The application can, if it
> is DAX aware, directly control whether DAX is used on a file or not.
> The application can even query and clear that persistent inode flag
> if it is configured not to (or cannot) use DAX.
>
> If the filesystem cannot support DAX, then we can error out attempts
> to set the DAX flag and then the app knows DAX is not available.
> i.e. the attempt to set policy failed. If the flag is set, then the
> inode will *always* use DAX - there is no "fall back to page cache"
> when DAX is enabled.
>
> If the application is not DAX aware, then the admin can control the
> DAX policy by manipulating these flags themselves, and hence control
> whether DAX is used by the application or not.
>
> If you think I'm dictating policy for DAX users and applications,
> then you haven't understood anything I've previously said about why
> the DAX mount option needs to die before any of this is considered
> production ready. DAX is not an opaque "all or nothing" option. XFS
> will provide apps and admins with fine-grained, persistent,
> discoverable policy flags to allow admins and applications to set
> DAX policies however they see fit. This simply cannot be done if the
> only knob you have is a mount option that may or may not stick.

I agree the mount option needs to die, and I fully grok the reasoning.
What I'm concerned with is that a system using fully-DAX-aware
applications is forced to incur the overhead of maintaining *sync
semantics, periodic sync(2) in particular, even if it is not relying
on those semantics.

However, like I said in my other mail, we can solve that with
alternate interfaces to persistent memory if that becomes an issue,
rather than requiring that "disable *sync" capability to come through DAX.
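
For reference, the per-inode policy mechanism Dave describes above
(set, query, and clear a persistent DAX flag, with a failed set telling
the app that DAX is unavailable) could look roughly like the sketch
below from userspace.  It assumes the flag is exposed as FS_XFLAG_DAX
through the generic FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR ioctls in
<linux/fs.h>.  The helper names dax_xflags(), dax_flag_query() and
dax_flag_set() are hypothetical, and whether a given filesystem honours
the flag with the "no fallback to page cache" behaviour discussed here
is not something this code can guarantee.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* struct fsxattr, FS_IOC_FS{GET,SET}XATTR, FS_XFLAG_DAX */

/* Hypothetical helper: return xflags with the DAX bit set or cleared,
 * leaving every other flag untouched. */
static unsigned int dax_xflags(unsigned int xflags, int enable)
{
    return enable ? (xflags | FS_XFLAG_DAX) : (xflags & ~FS_XFLAG_DAX);
}

/* Hypothetical helper: read the per-inode DAX flag of an open file.
 * Returns 0 and stores 0/1 in *enabled, or -1 with errno set. */
static int dax_flag_query(int fd, int *enabled)
{
    struct fsxattr fsx;

    assert(enabled != NULL);
    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
        return -1;
    *enabled = (fsx.fsx_xflags & FS_XFLAG_DAX) != 0;
    return 0;
}

/* Hypothetical helper: set or clear the per-inode DAX flag.
 * Read-modify-write so unrelated flags are preserved.  A -1 return
 * means the policy request failed and the caller should not assume
 * DAX is in effect. */
static int dax_flag_set(int fd, int enable)
{
    struct fsxattr fsx;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
        return -1;
    fsx.fsx_xflags = dax_xflags(fsx.fsx_xflags, enable);
    return ioctl(fd, FS_IOC_FSSETXATTR, &fsx);
}
```

A DAX-aware application would call dax_flag_set(fd, 1) on a file it
owns and treat a -1 return as "DAX not available here"; an admin tool
could use the same calls on behalf of an unmodified application.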


Thread overview:
2016-02-07  7:19 [PATCH 0/2] DAX bdev fixes - move flushing calls to FS Ross Zwisler
2016-02-07  7:19 ` [PATCH 1/2] dax: pass bdev argument to dax_clear_blocks() Ross Zwisler
2016-02-07 18:19   ` Dan Williams
2016-02-08  1:46     ` Ross Zwisler
2016-02-08  4:29       ` Ross Zwisler
2016-02-07 22:03   ` Dave Chinner
2016-02-08  1:44     ` Ross Zwisler
2016-02-08  5:17       ` Dave Chinner
2016-02-08 15:34         ` Ross Zwisler
2016-02-07  7:19 ` [PATCH 2/2] dax: move writeback calls into the filesystems Ross Zwisler
2016-02-07 19:13   ` Dan Williams
2016-02-07 21:50     ` Dave Chinner
2016-02-08  8:18       ` Dan Williams
2016-02-08 20:18         ` Dave Chinner
2016-02-08 20:55           ` Dan Williams [this message]
2016-02-08 20:58             ` Jeff Moyer
2016-02-08 22:05               ` Dan Williams
2016-02-09  9:43             ` Jan Kara
2016-02-09 16:01               ` Jan Kara
2016-02-09 18:06                 ` Ross Zwisler
2016-02-08 18:31     ` Ross Zwisler
2016-02-08 19:23       ` Dan Williams
2016-02-08 10:48   ` Jan Kara
2016-02-08 16:12     ` Ross Zwisler
