From: Ross Zwisler <ross.zwisler@linux.intel.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Jeff Layton <jlayton@poochiereds.net>, Jan Kara <jack@suse.cz>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-nvdimm@lists.01.org, Dave Chinner <david@fromorbit.com>,
	linux-kernel@vger.kernel.org,
	"J. Bruce Fields" <bfields@fieldses.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-xfs@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 1/7] xfs: always use DAX if mount option is used
Date: Tue, 26 Sep 2017 11:30:57 -0600
Message-ID: <20170926173057.GB20159@linux.intel.com>
In-Reply-To: <20170926143743.GB18758@lst.de>

On Tue, Sep 26, 2017 at 04:37:43PM +0200, Christoph Hellwig wrote:
> On Tue, Sep 26, 2017 at 09:09:57PM +1000, Dave Chinner wrote:
> > Well, quite frankly, I never wanted the mount option for XFS. It was
> > supposed to be for initial testing only, then we'd /always/ use the
> > inode flags. For a filesystem to default to using DAX, we
> > set the DAX flag on the root inode at mkfs time, and then everything
> > inode flag based just works.
> 
> And I deeply fundamentally disagree.  The mount option is a nice
> enough big hammer to try a mode without encoding nitty gritty details
> into the application ABI.
> 
> The per-inode persistent flag is the biggest nightmare ever, as we see
> in all these discussions about it.
> 
> What does it even mean?  Right now it forces direct addressing as long
> as the underlying media supports that.  But what about media that
> you can directly access but you really don't want to because it's really slow?
> Or media that is so god damn fast that you never want to buffer?  Or
> media where you want to buffer for writes (or at least some of them)
> but not for reads?
> 
> It encodes a very specific mechanism for an early direct access
> implementation into the ABI.  What we really need is for applications
> to declare an intent, not specify a particular mechanism.

I agree that Christoph's idea about having the system intelligently adjust to
use DAX based on performance information it gathers about the underlying
persistent memory (probably via the HMAT on x86_64 systems) is interesting,
but I think we're still a ways away from that.

FWIW, as my patches suggest and Jan observed, I think that we should allow
users to turn on DAX by treating the inode flag and the mount flag as an 'or'
operation, i.e. you get DAX if either the mount option is specified or the
inode flag is set, and you can continue to manipulate the per-inode flag as
you want regardless of the mount option.  I think this gives maximum
flexibility in the mechanism used to select DAX without enforcing policy.
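
To make the 'or' semantics concrete, here is a minimal sketch of the decision
as a truth table.  It is written in Python purely for brevity and is not code
from the patches: the function name and its parameters are hypothetical, and
the check that the underlying device supports DAX at all is an assumption
based on the "as long as the underlying media supports that" behavior quoted
above.

```python
from itertools import product


def dax_enabled(mount_opt_dax: bool, inode_flag_dax: bool,
                bdev_supports_dax: bool = True) -> bool:
    """Hypothetical helper: decide whether a file would be accessed via DAX.

    mount_opt_dax     -- filesystem was mounted with the dax option
    inode_flag_dax    -- the per-inode DAX flag is set on this file
    bdev_supports_dax -- assumption: the device can do direct access at all
    """
    if not bdev_supports_dax:
        # Neither knob can force DAX on media that cannot support it.
        return False
    # The proposal: either knob alone is sufficient.
    return mount_opt_dax or inode_flag_dax


if __name__ == "__main__":
    # Print the truth table for a DAX-capable device.
    for mount_opt, inode_flag in product([False, True], repeat=2):
        result = dax_enabled(mount_opt, inode_flag)
        print(f"mount option={mount_opt!s:5}  inode flag={inode_flag!s:5}"
              f"  ->  DAX={result}")
```

Note that under this scheme clearing the inode flag has no visible effect
while the mount option is set, which is part of why the meaning of each
option needs to be documented.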

In the end, though, I think what's really important is that we figure out what
the various options mean, have the same story for both XFS and ext4, and
document it as hch suggested in response to my patch 7 in this series.

Does it make sense at this point to just start a "dax" man page that can
contain info about the mount options, inode flags, kernel config options, how
to get PMDs, etc?  Or does this documentation need to be sprinkled around more
in existing man pages?
