linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Jane Chu <jane.chu@oracle.com>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCHSET RFC v2 jane 0/5] vfs: enable userspace to reset damaged file storage
Date: Sat, 18 Sep 2021 11:05:13 -0700	[thread overview]
Message-ID: <CAPcyv4i8qcDSFvRSMJTKQhedgbJ5c_cMKH3+FSX_veoLNVSvqQ@mail.gmail.com> (raw)
In-Reply-To: <163192864476.417973.143014658064006895.stgit@magnolia>

On Fri, Sep 17, 2021 at 6:30 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> Hi all,
>
> Jane Chu has taken an interest in trying to fix the pmem poison recovery
> story on Linux.  Since I sort of had a half-baked patchset that seems to
> contain some elements of what the reviewers of her patchset wanted, I'm
> releasing this reworked version to see if it has any traction.
>
> Our current "advice" to people using persistent memory and FSDAX who
> wish to recover upon receipt of a media error (aka 'hwpoison') event
> from ACPI is to punch-hole that part of the file and then pwrite it,
> which will magically cause the pmem to be reinitialized and the poison
> to be cleared.
>
> Punching doesn't make any sense at all -- the (re)allocation on pwrite
> does not permit the caller to specify where to find blocks, which means
> that we might not get the same pmem back.

Not sure this is a driving concern. If you get the same pmem back it
will have gone through a poison clearing cycle when it gets
reallocated. Also, once the filesystem gets notified of error
locations through Ruan's series the FS can avoid allocating blocks
where poison clearing failed.

> This pushes the user farther
> away from the goal of reinitializing poisoned memory and leads to
> complaints about unnecessary file fragmentation.

Fragmentation though is a valid concern.

>
> AFAICT, the only reason why the "punch and write" dance works at all is
> that the XFS and ext4 currently call blkdev_issue_zeroout when
> allocating pmem ahead of a write call.  Even a regular overwrite won't
> clear the poison, because dax_direct_access is smart enough to bail out
> on poisoned pmem, but not smart enough to clear it.

Alignment constraints were the entanglement that kept DAX from poison
clearing. This is similar to the dance you need to do to get a disk to
remap a bad block, which needs an O_DIRECT write. It was also deemed
messy to keep overloading writes this way.

> To be fair, that
> function maps pages and has no idea what kinds of reads and writes the
> caller might want to perform.
>
> Therefore, clean up this whole mess by creating a dax_zeroinit_range
> function that callers can use on poisoned persistent memory to reset the
> contents of the persistent memory to a known state (all zeroes) and
> clear any lingering poison state that might be lingering in the memory
> controllers.  Create a new fallocate mode to trigger this functionality,
> then wire up XFS and ext4 to use it.  For good measure, wire it up to
> traditional storage if the storage has a fast way to zero LBA contents,
> since we assume that those LBAs won't hit old media errors.

Sounds good, I'll take a look at the rest.

  parent reply	other threads:[~2021-09-18 18:05 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-18  1:30 [PATCHSET RFC v2 jane 0/5] vfs: enable userspace to reset damaged file storage Darrick J. Wong
2021-09-18  1:30 ` [PATCH 1/5] dax: prepare pmem for use by zero-initializing contents and clearing poisons Darrick J. Wong
2021-09-18 16:54   ` riteshh
2021-09-20 17:22     ` Darrick J. Wong
2021-09-21  4:07       ` riteshh
2021-09-22 18:26         ` Darrick J. Wong
2021-09-22 19:47           ` riteshh
2021-09-22 20:26           ` Dan Williams
2021-09-21  8:34   ` Christoph Hellwig
2021-09-22 18:10     ` Darrick J. Wong
2021-09-18  1:30 ` [PATCH 2/5] iomap: use accelerated zeroing on a block device to zero a file range Darrick J. Wong
2021-09-18 16:55   ` riteshh
2021-09-21  8:29   ` Christoph Hellwig
2021-09-22 18:53     ` Darrick J. Wong
2021-09-21 22:33   ` Dave Chinner
2021-09-22 18:54     ` Darrick J. Wong
2021-09-18  1:31 ` [PATCH 3/5] vfs: add a zero-initialization mode to fallocate Darrick J. Wong
2021-09-18 16:58   ` riteshh
2021-09-20 17:52   ` Eric Biggers
2021-09-20 18:06     ` Darrick J. Wong
2021-09-21  0:44   ` Dave Chinner
2021-09-21  8:31     ` Christoph Hellwig
2021-09-22  2:16       ` Dan Williams
2021-09-22  2:38         ` Darrick J. Wong
2021-09-22  3:59           ` Dave Chinner
2021-09-22  4:13             ` Darrick J. Wong
2021-09-22  5:49               ` Dave Chinner
2021-09-22 21:27                 ` Darrick J. Wong
2021-09-23  0:02                   ` Darrick J. Wong
2021-09-23  0:44                     ` Darrick J. Wong
2021-09-23  1:42                     ` Dave Chinner
2021-09-23  2:43                       ` Dan Williams
2021-09-23  5:42                         ` Dan Williams
2021-09-23 22:54                           ` Dave Chinner
2021-09-24  1:18                             ` Dan Williams
2021-09-24  1:21                               ` Jane Chu
2021-09-24  1:35                                 ` Darrick J. Wong
2021-09-27 21:07                                   ` Dave Chinner
2021-09-27 21:57                                     ` Jane Chu
2021-09-28  0:08                                       ` Dan Williams
2021-09-22  5:28     ` riteshh
2021-09-18  1:31 ` [PATCH 4/5] xfs: implement FALLOC_FL_ZEROINIT_RANGE Darrick J. Wong
2021-09-18  1:31 ` [PATCH 5/5] ext4: " Darrick J. Wong
2021-09-18 17:07   ` riteshh
2021-09-20 18:11     ` Darrick J. Wong
2021-09-21  6:10       ` riteshh
2021-09-18 18:05 ` Dan Williams [this message]
2021-09-23  0:51 ` [PATCHSET RFC v2 jane 0/5] vfs: enable userspace to reset damaged file storage Darrick J. Wong
2021-09-23  1:17   ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4i8qcDSFvRSMJTKQhedgbJ5c_cMKH3+FSX_veoLNVSvqQ@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=jane.chu@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).