All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	XFS Developers <xfs@oss.sgi.com>, Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs
Date: Fri, 16 Sep 2016 09:07:48 +1000	[thread overview]
Message-ID: <20160915230748.GS30497@dastard> (raw)
In-Reply-To: <CAPcyv4jTw3cXpmmJRh7t16Xy2uYofDe+fJ+X_jnz+Q=o0uGneg@mail.gmail.com>

On Thu, Sep 15, 2016 at 10:01:03AM -0700, Dan Williams wrote:
> On Thu, Sep 15, 2016 at 1:26 AM, Christoph Hellwig <hch@lst.de> wrote:
> > On Wed, Sep 14, 2016 at 11:54:38PM -0700, Dan Williams wrote:
> >> The DAX property, page cache bypass, of a VMA is only detectable via the
> >> vma_is_dax() helper to check the S_DAX inode flag.  However, this is
> >> only available internal to the kernel and is a property that userspace
> >> applications would like to interrogate.
> >
> > They have absolutely no business knowing such an implementation detail.
> 
> Hasn't that train already left the station with FS_XFLAG_DAX?

No, that's an admin flag, not a runtime hint for applications. Just
because that flag is set on an inode, it does not mean that DAX is
actually in use - it will be ignored if the backing dev is not dax
capable.

> The other problem with hiding the DAX property is that it turns out to
> not be a transparent acceleration feature.  See xfs/086 xfs/088
> xfs/089 xfs/091 which fail with DAX and, as far as I understand, it is
> due to the fact that DAX disallows delayed allocation behavior.

Which is not a bug, nor is it something that app developers should
be surprised by.

i.e. Subtle differences in error reporting behaviour occur in
filesystems /all the time/. Run the test on a non-dax filesystem
with an extent size hint. It fails /exactly the same way as DAX/.
Run it with direct IO - fails the same way as DAX. Run it
with synchronous writes - it fails the same way as DAX.

IOWs, if an app can't handle the way DAX reports errors, then they
are /broken/. Delayed allocation requires checking the return value
of fsync() or close() to capture the allocation error - many more
apps get that wrong than the ones that expect the immediate errors
from write()...

Anyway: to domeonstrate that the nothign is actually broken, and
you might sometimes need to fix tests and send patches to
fstests@vger.kernel.org, this makes xfs/086 pass for me on DAX:

--- a/tests/xfs/086
+++ b/tests/xfs/086
@@ -96,7 +96,8 @@ _scratch_mount
 
 echo "+ modify files"
 for x in `seq 1 64`; do
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" \
+		>> $seqres.full 2>&1
 done
 umount "${SCRATCH_MNT}"
 
Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>, Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	XFS Developers <xfs@oss.sgi.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs
Date: Fri, 16 Sep 2016 09:07:48 +1000	[thread overview]
Message-ID: <20160915230748.GS30497@dastard> (raw)
In-Reply-To: <CAPcyv4jTw3cXpmmJRh7t16Xy2uYofDe+fJ+X_jnz+Q=o0uGneg@mail.gmail.com>

On Thu, Sep 15, 2016 at 10:01:03AM -0700, Dan Williams wrote:
> On Thu, Sep 15, 2016 at 1:26 AM, Christoph Hellwig <hch@lst.de> wrote:
> > On Wed, Sep 14, 2016 at 11:54:38PM -0700, Dan Williams wrote:
> >> The DAX property, page cache bypass, of a VMA is only detectable via the
> >> vma_is_dax() helper to check the S_DAX inode flag.  However, this is
> >> only available internal to the kernel and is a property that userspace
> >> applications would like to interrogate.
> >
> > They have absolutely no business knowing such an implementation detail.
> 
> Hasn't that train already left the station with FS_XFLAG_DAX?

No, that's an admin flag, not a runtime hint for applications. Just
because that flag is set on an inode, it does not mean that DAX is
actually in use - it will be ignored if the backing dev is not dax
capable.

> The other problem with hiding the DAX property is that it turns out to
> not be a transparent acceleration feature.  See xfs/086 xfs/088
> xfs/089 xfs/091 which fail with DAX and, as far as I understand, it is
> due to the fact that DAX disallows delayed allocation behavior.

Which is not a bug, nor is it something that app developers should
be surprised by.

i.e. Subtle differences in error reporting behaviour occur in
filesystems /all the time/. Run the test on a non-dax filesystem
with an extent size hint. It fails /exactly the same way as DAX/.
Run it with direct IO - fails the same way as DAX. Run it
with synchronous writes - it fails the same way as DAX.

IOWs, if an app can't handle the way DAX reports errors, then they
are /broken/. Delayed allocation requires checking the return value
of fsync() or close() to capture the allocation error - many more
apps get that wrong than the ones that expect the immediate errors
from write()...

Anyway: to domeonstrate that the nothign is actually broken, and
you might sometimes need to fix tests and send patches to
fstests@vger.kernel.org, this makes xfs/086 pass for me on DAX:

--- a/tests/xfs/086
+++ b/tests/xfs/086
@@ -96,7 +96,8 @@ _scratch_mount
 
 echo "+ modify files"
 for x in `seq 1 64`; do
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" \
+		>> $seqres.full 2>&1
 done
 umount "${SCRATCH_MNT}"
 
Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>, Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	XFS Developers <xfs@oss.sgi.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs
Date: Fri, 16 Sep 2016 09:07:48 +1000	[thread overview]
Message-ID: <20160915230748.GS30497@dastard> (raw)
In-Reply-To: <CAPcyv4jTw3cXpmmJRh7t16Xy2uYofDe+fJ+X_jnz+Q=o0uGneg@mail.gmail.com>

On Thu, Sep 15, 2016 at 10:01:03AM -0700, Dan Williams wrote:
> On Thu, Sep 15, 2016 at 1:26 AM, Christoph Hellwig <hch@lst.de> wrote:
> > On Wed, Sep 14, 2016 at 11:54:38PM -0700, Dan Williams wrote:
> >> The DAX property, page cache bypass, of a VMA is only detectable via the
> >> vma_is_dax() helper to check the S_DAX inode flag.  However, this is
> >> only available internal to the kernel and is a property that userspace
> >> applications would like to interrogate.
> >
> > They have absolutely no business knowing such an implementation detail.
> 
> Hasn't that train already left the station with FS_XFLAG_DAX?

No, that's an admin flag, not a runtime hint for applications. Just
because that flag is set on an inode, it does not mean that DAX is
actually in use - it will be ignored if the backing dev is not dax
capable.

> The other problem with hiding the DAX property is that it turns out to
> not be a transparent acceleration feature.  See xfs/086 xfs/088
> xfs/089 xfs/091 which fail with DAX and, as far as I understand, it is
> due to the fact that DAX disallows delayed allocation behavior.

Which is not a bug, nor is it something that app developers should
be surprised by.

i.e. Subtle differences in error reporting behaviour occur in
filesystems /all the time/. Run the test on a non-dax filesystem
with an extent size hint. It fails /exactly the same way as DAX/.
Run it with direct IO - fails the same way as DAX. Run it
with synchronous writes - it fails the same way as DAX.

IOWs, if an app can't handle the way DAX reports errors, then they
are /broken/. Delayed allocation requires checking the return value
of fsync() or close() to capture the allocation error - many more
apps get that wrong than the ones that expect the immediate errors
from write()...

Anyway: to domeonstrate that the nothign is actually broken, and
you might sometimes need to fix tests and send patches to
fstests@vger.kernel.org, this makes xfs/086 pass for me on DAX:

--- a/tests/xfs/086
+++ b/tests/xfs/086
@@ -96,7 +96,8 @@ _scratch_mount
 
 echo "+ modify files"
 for x in `seq 1 64`; do
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" \
+		>> $seqres.full 2>&1
 done
 umount "${SCRATCH_MNT}"
 
Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dave Chinner <david@fromorbit.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	XFS Developers <xfs@oss.sgi.com>, Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs
Date: Fri, 16 Sep 2016 09:07:48 +1000	[thread overview]
Message-ID: <20160915230748.GS30497@dastard> (raw)
In-Reply-To: <CAPcyv4jTw3cXpmmJRh7t16Xy2uYofDe+fJ+X_jnz+Q=o0uGneg@mail.gmail.com>

On Thu, Sep 15, 2016 at 10:01:03AM -0700, Dan Williams wrote:
> On Thu, Sep 15, 2016 at 1:26 AM, Christoph Hellwig <hch@lst.de> wrote:
> > On Wed, Sep 14, 2016 at 11:54:38PM -0700, Dan Williams wrote:
> >> The DAX property, page cache bypass, of a VMA is only detectable via the
> >> vma_is_dax() helper to check the S_DAX inode flag.  However, this is
> >> only available internal to the kernel and is a property that userspace
> >> applications would like to interrogate.
> >
> > They have absolutely no business knowing such an implementation detail.
> 
> Hasn't that train already left the station with FS_XFLAG_DAX?

No, that's an admin flag, not a runtime hint for applications. Just
because that flag is set on an inode, it does not mean that DAX is
actually in use - it will be ignored if the backing dev is not dax
capable.

> The other problem with hiding the DAX property is that it turns out to
> not be a transparent acceleration feature.  See xfs/086 xfs/088
> xfs/089 xfs/091 which fail with DAX and, as far as I understand, it is
> due to the fact that DAX disallows delayed allocation behavior.

Which is not a bug, nor is it something that app developers should
be surprised by.

i.e. Subtle differences in error reporting behaviour occur in
filesystems /all the time/. Run the test on a non-dax filesystem
with an extent size hint. It fails /exactly the same way as DAX/.
Run it with direct IO - fails the same way as DAX. Run it
with synchronous writes - it fails the same way as DAX.

IOWs, if an app can't handle the way DAX reports errors, then they
are /broken/. Delayed allocation requires checking the return value
of fsync() or close() to capture the allocation error - many more
apps get that wrong than the ones that expect the immediate errors
from write()...

Anyway: to domeonstrate that the nothign is actually broken, and
you might sometimes need to fix tests and send patches to
fstests@vger.kernel.org, this makes xfs/086 pass for me on DAX:

--- a/tests/xfs/086
+++ b/tests/xfs/086
@@ -96,7 +96,8 @@ _scratch_mount
 
 echo "+ modify files"
 for x in `seq 1 64`; do
-	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" >> $seqres.full
+	$XFS_IO_PROG -f -c "pwrite -S 0x62 0 ${blksz}" "${TESTFILE}.${x}" \
+		>> $seqres.full 2>&1
 done
 umount "${SCRATCH_MNT}"
 
Cheers,

Dave.

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  parent reply	other threads:[~2016-09-15 23:07 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-15  6:54 [PATCH v2 0/3] mm, dax: export dax capabilities and mapping size info to userspace Dan Williams
2016-09-15  6:54 ` Dan Williams
2016-09-15  6:54 ` Dan Williams
2016-09-15  6:54 ` Dan Williams
2016-09-15  6:54 ` [PATCH v2 1/3] mm, dax: add VM_SYNC flag for device-dax VMAs Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54 ` [PATCH v2 2/3] mm, dax: add VM_DAX flag for DAX VMAs Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  8:26   ` Christoph Hellwig
2016-09-15  8:26     ` Christoph Hellwig
2016-09-15  8:26     ` Christoph Hellwig
2016-09-15 17:01     ` Dan Williams
2016-09-15 17:01       ` Dan Williams
2016-09-15 17:01       ` Dan Williams
2016-09-15 17:01       ` Dan Williams
2016-09-15 17:09       ` Darrick J. Wong
2016-09-15 17:09         ` Darrick J. Wong
2016-09-15 17:09         ` Darrick J. Wong
2016-09-15 17:09         ` Darrick J. Wong
2016-09-15 17:44         ` Dan Williams
2016-09-15 17:44           ` Dan Williams
2016-09-15 17:44           ` Dan Williams
2016-09-15 17:44           ` Dan Williams
2016-09-15 23:07       ` Dave Chinner [this message]
2016-09-15 23:07         ` Dave Chinner
2016-09-15 23:07         ` Dave Chinner
2016-09-15 23:07         ` Dave Chinner
2016-09-15 23:19         ` Dan Williams
2016-09-15 23:19           ` Dan Williams
2016-09-15 23:19           ` Dan Williams
2016-09-15 23:19           ` Dan Williams
2016-09-16  0:16         ` Dan Williams
2016-09-16  0:16           ` Dan Williams
2016-09-16  0:16           ` Dan Williams
2016-09-16  0:16           ` Dan Williams
2016-09-16  1:24           ` Dave Chinner
2016-09-16  1:24             ` Dave Chinner
2016-09-16  1:24             ` Dave Chinner
2016-09-16  1:24             ` Dave Chinner
2016-09-16  2:04             ` Dan Williams
2016-09-16  2:04               ` Dan Williams
2016-09-16  2:04               ` Dan Williams
2016-09-16  2:04               ` Dan Williams
2016-09-16  3:41               ` Dan Williams
2016-09-16  3:41                 ` Dan Williams
2016-09-16  3:41                 ` Dan Williams
2016-09-16  3:41                 ` Dan Williams
2016-09-16  5:36               ` Dave Chinner
2016-09-16  5:36                 ` Dave Chinner
2016-09-16  5:36                 ` Dave Chinner
2016-09-16  5:36                 ` Dave Chinner
2016-09-16 10:47                 ` Dan Williams
2016-09-16 10:47                   ` Dan Williams
2016-09-16 10:47                   ` Dan Williams
2016-09-16 10:47                   ` Dan Williams
2016-09-15  6:54 ` [PATCH v2 3/3] mm, mincore2(): retrieve tlb-size attributes of an address range Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams
2016-09-15  6:54   ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160915230748.GS30497@dastard \
    --to=david@fromorbit.com \
    --cc=dan.j.williams@intel.com \
    --cc=hch@lst.de \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=npiggin@gmail.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.