nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jane Chu <jane.chu@oracle.com>,
	 Vishal L Verma <vishal.l.verma@intel.com>,
	Dave Jiang <dave.jiang@intel.com>,
	 "Weiny, Ira" <ira.weiny@intel.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	 Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>,
	 Linux NVDIMM <nvdimm@lists.linux.dev>,
	 Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	 linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 0/3] dax: clear poison on the fly along pwrite
Date: Fri, 17 Sep 2021 13:21:25 -0700	[thread overview]
Message-ID: <CAPcyv4iAr_Vwwgqw+4wz0RQUXhUUJGGz7_T+p+W6tC4T+k+zNw@mail.gmail.com> (raw)
In-Reply-To: <20210917152744.GA10250@magnolia>

On Fri, Sep 17, 2021 at 8:27 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> On Fri, Sep 17, 2021 at 01:53:33PM +0100, Christoph Hellwig wrote:
> > On Thu, Sep 16, 2021 at 11:40:28AM -0700, Dan Williams wrote:
> > > > That was my gut feeling.  If everyone feels 100% comfortable with
> > > > zeroingas the mechanism to clear poisoning I'll cave in.  The most
> > > > important bit is that we do that through a dedicated DAX path instead
> > > > of abusing the block layer even more.
> > >
> > > ...or just rename dax_zero_page_range() to dax_reset_page_range()?
> > > Where reset == "zero + clear-poison"?
> >
> > I'd say that naming is more confusing than overloading zero.
>
> How about dax_zeroinit_range() ?

Works for me.

>
> To go with its fallocate flag (yeah I've been too busy sorting out -rc1
> regressions to repost this) FALLOC_FL_ZEROINIT_RANGE that will reset the
> hardware (whatever that means) and set the contents to the known value
> zero.
>
> Userspace usage model:
>
> void handle_media_error(int fd, loff_t pos, size_t len)
> {
>         /* yell about this for posterior's sake */
>
>         ret = fallocate(fd, FALLOC_FL_ZEROINIT_RANGE, pos, len);
>
>         /* yay our disk drive / pmem / stone table engraver is online */

The fallocate mode can still be error-aware though, right? When the FS
has knowledge of the error locations the fallocate mode could be
fallocate(fd, FALLOC_FL_OVERWRITE_ERRORS, pos, len) with the semantics
of attempting to zero out any known poison extents in the given file
range? At the risk of going overboard on new fallocate modes there
could also (or instead of) be FALLOC_FL_PUNCH_ERRORS to skip trying to
clear them and just ask the FS to throw error extents away.

> }
>
> > > > I'm really worried about both patartitions on DAX and DM passing through
> > > > DAX because they deeply bind DAX to the block layer, which is just a bad
> > > > idea.  I think we also need to sort that whole story out before removing
> > > > the EXPERIMENTAL tags.
> > >
> > > I do think it was a mistake to allow for DAX on partitions of a pmemX
> > > block-device.
> > >
> > > DAX-reflink support may be the opportunity to start deprecating that
> > > support. Only enable DAX-reflink for direct mounting on /dev/pmemX
> > > without partitions (later add dax-device direct mounting),
> >
> > I think we need to fully or almost fully sort this out.
> >
> > Here is my bold suggestions:
> >
> >  1) drop no drop the EXPERMINTAL on the current block layer overload
> >     at all
>
> I don't understand this.
>
> >  2) add direct mounting of the nvdimm namespaces ASAP.  Because all
> >     the filesystem currently also need the /dev/pmem0 device add a way
> >     to open the block device by the dax_device instead of our current
> >     way of doing the reverse
> >  3) deprecate DAX support through block layer mounts with a say 2 year
> >     deprecation period
> >  4) add DAX remapping devices as needed
>
> What devices are needed?  linear for lvm, and maybe error so we can
> actually test all this stuff?

The proposal would be zero lvm support. The nvdimm namespace
definition would need to grow support for concatenation + striping.
Soft error injection could be achieved by writing to the badblocks
interface.

  reply	other threads:[~2021-09-17 20:21 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-14 23:31 [PATCH 0/3] dax: clear poison on the fly along pwrite Jane Chu
2021-09-14 23:31 ` [PATCH 1/3] dax: introduce dax_operation dax_clear_poison Jane Chu
2021-11-04 17:53   ` Christoph Hellwig
2021-09-14 23:31 ` [PATCH 2/3] dax: introduce dax_clear_poison to dax pwrite operation Jane Chu
2021-11-04 17:53   ` Christoph Hellwig
2021-09-14 23:31 ` [PATCH 2/3] dax: introduce dax clear poison to page aligned " Jane Chu
2021-09-14 23:31 ` [PATCH 3/3] libnvdimm/pmem: Provide pmem_dax_clear_poison for dax operation Jane Chu
2021-11-04 17:55   ` Christoph Hellwig
2021-11-04 20:27     ` Jane Chu
2021-09-15  4:44 ` [PATCH 0/3] dax: clear poison on the fly along pwrite Dan Williams
2021-09-15  7:22   ` Jane Chu
2021-09-15 16:15     ` Darrick J. Wong
2021-09-15 20:27       ` Dan Williams
2021-09-16  0:05         ` Darrick J. Wong
2021-09-16  7:11         ` Christoph Hellwig
2021-09-16 18:40           ` Dan Williams
2021-09-17 12:53             ` Christoph Hellwig
2021-09-17 15:27               ` Darrick J. Wong
2021-09-17 20:21                 ` Dan Williams [this message]
2021-09-18  0:07                   ` Darrick J. Wong
2021-09-17 19:37               ` Dan Williams
2021-09-23 20:48         ` Jane Chu
2021-09-23 20:55       ` Jane Chu
2021-09-23 21:42         ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4iAr_Vwwgqw+4wz0RQUXhUUJGGz7_T+p+W6tC4T+k+zNw@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=jack@suse.cz \
    --cc=jane.chu@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=viro@zeniv.linux.org.uk \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).