ocfs2-devel.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "y-goto@fujitsu.com" <y-goto@fujitsu.com>,
	"jack@suse.cz" <jack@suse.cz>,
	"fnstml-iaas@cn.fujitsu.com" <fnstml-iaas@cn.fujitsu.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"ocfs2-devel@oss.oracle.com" <ocfs2-devel@oss.oracle.com>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"qi.fuli@fujitsu.com" <qi.fuli@fujitsu.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [Ocfs2-devel] Question about the "EXPERIMENTAL" tag for dax in XFS
Date: Mon, 1 Mar 2021 21:50:21 -0800	[thread overview]
Message-ID: <CAPcyv4g03VU8Z8MarMOrW6oJpO-4QFe9-P=CFu3C0z6St3vLuQ@mail.gmail.com> (raw)
In-Reply-To: <20210302053828.GI4662@dread.disaster.area>

On Mon, Mar 1, 2021 at 9:38 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Mon, Mar 01, 2021 at 07:33:28PM -0800, Dan Williams wrote:
> > On Mon, Mar 1, 2021 at 6:42 PM Dave Chinner <david@fromorbit.com> wrote:
> > [..]
> > > We do not need a DAX specific mechanism to tell us "DAX device
> > > gone", we need a generic block device interface that tells us "range
> > > of block device is gone".
> >
> > This is the crux of the disagreement. The block_device is going away
> > *and* the dax_device is going away.
>
> No, that is not the disagreement I have with what you are saying.
> You still haven't understand that it's even more basic and generic
> than devices going away. At the simplest form, all the filesystem
> wants is to be notified of is when *unrecoverable media errors*
> occur in the persistent storage that underlies the filesystem.
>
> The filesystem does not care what that media is build from - PMEM,
> flash, corroded spinning disks, MRAM, or any other persistent media
> you can think off. It just doesn't matter.
>
> What we care about is that the contents of a *specific LBA range* no
> longer contain *valid data*. IOWs, the data in that range of the
> block device has been lost, cannot be retreived and/or cannot be
> written to any more.
>
> PMEM taking a MCE because ECC tripped is a media error because data
> is lost and inaccessible until recovery actions are taken.
>
> MD RAID failing a scrub is a media error and data is lost and
> unrecoverable at that layer.
>
> A device disappearing is a media error because the storage media is
> now permanently inaccessible to the higher layers.
>
> This "media error" categorisation is a fundamental property of
> persistent storage and, as such, is a property of the block devices
> used to access said persistent storage.
>
> That's the disagreement here - that you and Christoph are saying
> ->corrupted_range is not a block device property because only a
> pmem/DAX device currently generates it.
>
> You both seem to be NACKing a generic interface because it's only
> implemented for the first subsystem that needs it. AFAICT, you
> either don't understand or are completely ignoring the architectural
> need for it to be provided across the rest of the storage stack that
> *block device based filesystems depend on*.

No I'm NAKing it because it's the wrong interface. See my 'struct
badblocks' argument in the reply to Darrick. That 'struct badblocks'
infrastructure arose from MD and is shared with PMEM.

>
> Sure, there might be dax device based fielsystems around the corner.
> They just require a different pmem device ->corrupted_range callout
> to implement the notification - one that directs to the dax device
> rather than the block device. That's simple and trivial to
> implement, but such functionaity for DAX devices  does not replace
> the need for the same generic functionality to be provided across a
> *range of different block devices* as required by *block device
> based filesystems*.
>
> And that's fundamentally the problem. XFS is block device based, not
> DAX device based. We require errors to be reported through block
> device mechanisms. fs-dax does not change this - it is based on pmem
> being presented as a primarily as a block device to the block device
> based filesystems and only secondarily as a dax device. Hence if it
> can be trivially implemented as a block device interface, that's
> where it should go, because then all the other block devices that
> the filesytem runs on can provide the same functionality for similar
> media error events....

Sure, use 'struct badblocks' not struct block_device and
block_device_operations.
>
> > The dax_device removal implies one
> > set of actions (direct accessed pfns invalid) the block device removal
> > implies another (block layer sector access offline).
>
> There you go again, saying DAX requires an action, while the block
> device notification is a -state change- (i.e. goes offline).

There you go reacting to the least generous interpretation of what I said.

s/pfns invalid/pfns offline/

>
> This is exactly what I said was wrong in my last email.
>
> > corrupted_range
> > is blurring the notification for 2 different failure domains. Look at
> > the nascent idea to mount a filesystem on dax sans a block device.
> > Look at the existing plumbing for DM to map dax_operations through a
> > device stack.
>
> Ummm, it just maps the direct_access call to the underlying device
> and calls it's ->direct_access method. All it's doing is LBA
> mapping. That's all it needs to do for ->corrupted_range, too.
> I have no clue why you think this is a problem for error
> notification...
>
> > Look at the pushback Ruan got for adding a new
> > block_device operation for corrupted_range().
>
> one person said "no". That's hardly pushback. Especially as I think
> Christoph's objection about this being dax specific functionality
> is simply wrong, as per above.

It's not wrong when we have a perfectly suitable object for sector
based error notification and when we're trying to disentangle 'struct
block_device' from 'struct dax_device'.

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

  reply	other threads:[~2021-03-02  5:50 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-26  0:20 [Ocfs2-devel] [PATCH v2 00/10] fsdax, xfs: Add reflink&dedupe support for fsdax Shiyang Ruan
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 01/10] fsdax: Factor helpers to simplify dax fault code Shiyang Ruan
2021-03-03  9:13   ` Christoph Hellwig
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 02/10] fsdax: Factor helper: dax_fault_actor() Shiyang Ruan
2021-03-03  9:28   ` Christoph Hellwig
2021-03-12  9:01     ` ruansy.fnst
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 03/10] fsdax: Output address in dax_iomap_pfn() and rename it Shiyang Ruan
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 05/10] fsdax: Replace mmap entry in case of CoW Shiyang Ruan
2021-03-03  9:30   ` Christoph Hellwig
2021-03-03  9:41     ` ruansy.fnst
2021-03-03  9:44       ` Christoph Hellwig
2021-03-03  9:48   ` Christoph Hellwig
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 08/10] fsdax: Dedup file range to use a compare function Shiyang Ruan
2021-02-26  8:28   ` Shiyang Ruan
2021-03-03  8:20   ` Joe Perches
2021-03-03  8:45     ` ruansy.fnst
2021-03-03  9:04       ` Joe Perches
2021-03-03  9:39       ` hch
2021-03-03  9:46         ` ruansy.fnst
2021-03-04  5:42   ` [Ocfs2-devel] [RESEND PATCH v2.1 " Shiyang Ruan
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 09/10] fs/xfs: Handle CoW for fsdax write() path Shiyang Ruan
2021-03-03  9:43   ` Christoph Hellwig
2021-03-03  9:57     ` ruansy.fnst
2021-03-03 10:43       ` Christoph Hellwig
2021-03-04  1:35         ` ruansy.fnst
2021-02-26  0:20 ` [Ocfs2-devel] [PATCH v2 10/10] fs/xfs: Add dedupe support for fsdax Shiyang Ruan
2021-02-26  9:45 ` [Ocfs2-devel] Question about the "EXPERIMENTAL" tag for dax in XFS ruansy.fnst
2021-02-26 19:04   ` Darrick J. Wong
2021-02-26 19:24     ` Dan Williams
2021-02-26 20:51       ` Dave Chinner
2021-02-26 20:59         ` Dan Williams
2021-02-26 21:27           ` Dave Chinner
2021-02-26 22:41             ` Dan Williams
2021-02-27 22:36               ` Dave Chinner
2021-02-27 23:40                 ` Dan Williams
2021-02-28 22:38                   ` Dave Chinner
2021-03-01 20:55                     ` Dan Williams
2021-03-01 22:46                       ` Dave Chinner
2021-03-02  0:32                         ` Dan Williams
2021-03-02  2:42                           ` Dave Chinner
2021-03-02  3:33                             ` Dan Williams
2021-03-02  5:38                               ` Dave Chinner
2021-03-02  5:50                                 ` Dan Williams [this message]
2021-03-02  3:28                       ` Darrick J. Wong
2021-03-02  5:41                         ` Dan Williams
2021-03-02  7:57                           ` Dave Chinner
2021-03-02 17:49                             ` Dan Williams
2021-03-04 23:40                               ` Darrick J. Wong
2021-03-01  7:26       ` Yasunori Goto
2021-03-01 21:34         ` Dan Williams
     [not found] ` <20210226002030.653855-5-ruansy.fnst@fujitsu.com>
2021-03-03  9:29   ` [Ocfs2-devel] [PATCH v2 04/10] fsdax: Introduce dax_iomap_cow_copy() Christoph Hellwig
     [not found] ` <20210226002030.653855-7-ruansy.fnst@fujitsu.com>
2021-03-03  9:31   ` [Ocfs2-devel] [PATCH v2 06/10] fsdax: Add dax_iomap_cow_copy() for dax_iomap_zero Christoph Hellwig
     [not found] ` <20210226002030.653855-8-ruansy.fnst@fujitsu.com>
2021-02-26  4:14   ` [Ocfs2-devel] [PATCH v2 07/10] iomap: Introduce iomap_apply2() for operations on two files Darrick J. Wong
2021-02-26  8:11     ` ruansy.fnst
2021-02-26  8:25   ` Shiyang Ruan
2021-03-04  5:41   ` [Ocfs2-devel] [RESEND PATCH v2.1 " Shiyang Ruan
2021-03-11 12:30     ` Christoph Hellwig
2021-03-09  6:36 ` [Ocfs2-devel] [PATCH v2 00/10] fsdax, xfs: Add reflink&dedupe support for fsdax Xiaoguang Wang
2021-03-10  1:32   ` ruansy.fnst
2021-03-09 16:19 ` Goldwyn Rodrigues
2021-03-10  1:26   ` ruansy.fnst
2021-03-10 12:30 ` Neal Gompa
2021-03-10 13:02   ` Matthew Wilcox
2021-03-10 13:36     ` Neal Gompa
2021-03-10 13:55       ` Matthew Wilcox
2021-03-10 14:21     ` Goldwyn Rodrigues
2021-03-10 14:26       ` Matthew Wilcox
2021-03-10 17:04         ` Goldwyn Rodrigues
2021-03-11  0:53         ` Dan Williams
2021-03-11  8:26           ` Neal Gompa
2021-03-13 13:07         ` Adam Borowski
2021-03-13 16:24           ` Neal Gompa
2021-03-13 22:00             ` Adam Borowski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPcyv4g03VU8Z8MarMOrW6oJpO-4QFe9-P=CFu3C0z6St3vLuQ@mail.gmail.com' \
    --to=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=fnstml-iaas@cn.fujitsu.com \
    --cc=jack@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=ocfs2-devel@oss.oracle.com \
    --cc=qi.fuli@fujitsu.com \
    --cc=ruansy.fnst@fujitsu.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=y-goto@fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).