All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Jane Chu <jane.chu@oracle.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	 "david@fromorbit.com" <david@fromorbit.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	 "dave.jiang@intel.com" <dave.jiang@intel.com>,
	"agk@redhat.com" <agk@redhat.com>,
	 "snitzer@redhat.com" <snitzer@redhat.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	 "ira.weiny@intel.com" <ira.weiny@intel.com>,
	"willy@infradead.org" <willy@infradead.org>,
	 "vgoyal@redhat.com" <vgoyal@redhat.com>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	 "nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
Date: Thu, 4 Nov 2021 09:08:41 -0700	[thread overview]
Message-ID: <CAPcyv4jaCj=qDw-MHEcYjVGHYGvX8wbJ_d3kv5nnv7agHnMViQ@mail.gmail.com> (raw)
In-Reply-To: <YYObn+0juAFvH7Fk@infradead.org>

On Thu, Nov 4, 2021 at 1:36 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Nov 03, 2021 at 11:21:39PM -0700, Dan Williams wrote:
> > The concern I have with dax_clear_poison() is that it precludes atomic
> > error clearing.
>
> atomic as in clear poison and write the actual data?  Yes, that would
> be useful, but it is not how the Intel pmem support actually works, right?

Yes, atomic clear+write new data. The ability to atomic clear requires
either a CPU with the ability to overwrite cachelines without doing a
RMW cycle (MOVDIR64B), or it requires a device with a suitable
slow-path mailbox command like the one defined for CXL devices (see
section 8.2.9.5.4.3 Clear Poison in CXL 2.0).

I don't know why you think these devices don't perform wear-leveling
with spare blocks?

> > Also, as Boris and I discussed, poisoned pages should
> > be marked NP (not present) rather than UC (uncacheable) [1].
>
> This would not really have an affect on the series, right?  But yes,
> that seems like the right thing to do.

It would because the implementations would need to be careful to clear
poison in an entire page before any of it could be accessed. With an
enlightened write-path RWF flag or custom fault handler it could do
sub-page overwrites of poison. Not that I think the driver should
optimize for multiple failed cachelines in a page, but it does mean
dax_clear_poison() fails in more theoretical scenarios.

> > With
> > those 2 properties combined I think that wants a custom pmem fault
> > handler that knows how to carefully write to pmem pages with poison
> > present, rather than an additional explicit dax-operation. That also
> > meets Christoph's requirement of "works with the intended direct
> > memory map use case".
>
> So we have 3 kinds of accesses to DAX memory:
>
>  (1) user space mmap direct access.
>  (2) iov_iter based access (could be from kernel or userspace)
>  (3) open coded kernel access using ->direct_access
>
> One thing I noticed:  (2) could also work with kernel memory or pages,
> but that doesn't use MC safe access.

Yes, but after the fight to even get copy_mc_to_kernel() to exist for
pmem_copy_to_iter() I did not have the nerve to push for wider usage.

> Which seems like a major independent
> of this discussion.
>
> I suspect all kernel access could work fine with a copy_mc_to_kernel
> helper as long as everyone actually uses it,

All kernel accesses do use it. They either route to
pmem_copy_to_iter(), or like dm-writecache, call it directly. Do you
see a kernel path that does not use that helper?

> missing required bits of (2) and (3) together with something like the
> ->clear_poison series from Jane. We just need to think hard what we
> want to do for userspace mmap access.

dax_clear_poison() is at least ready to go today and does not preclude
adding the atomic and finer grained support later.

WARNING: multiple messages have this Message-ID (diff)
From: Dan Williams <dan.j.williams@intel.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Jane Chu <jane.chu@oracle.com>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"willy@infradead.org" <willy@infradead.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"vgoyal@redhat.com" <vgoyal@redhat.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"agk@redhat.com" <agk@redhat.com>
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
Date: Thu, 4 Nov 2021 09:08:41 -0700	[thread overview]
Message-ID: <CAPcyv4jaCj=qDw-MHEcYjVGHYGvX8wbJ_d3kv5nnv7agHnMViQ@mail.gmail.com> (raw)
In-Reply-To: <YYObn+0juAFvH7Fk@infradead.org>

On Thu, Nov 4, 2021 at 1:36 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Nov 03, 2021 at 11:21:39PM -0700, Dan Williams wrote:
> > The concern I have with dax_clear_poison() is that it precludes atomic
> > error clearing.
>
> atomic as in clear poison and write the actual data?  Yes, that would
> be useful, but it is not how the Intel pmem support actually works, right?

Yes, atomic clear+write new data. The ability to atomic clear requires
either a CPU with the ability to overwrite cachelines without doing a
RMW cycle (MOVDIR64B), or it requires a device with a suitable
slow-path mailbox command like the one defined for CXL devices (see
section 8.2.9.5.4.3 Clear Poison in CXL 2.0).

I don't know why you think these devices don't perform wear-leveling
with spare blocks?

> > Also, as Boris and I discussed, poisoned pages should
> > be marked NP (not present) rather than UC (uncacheable) [1].
>
> This would not really have an affect on the series, right?  But yes,
> that seems like the right thing to do.

It would because the implementations would need to be careful to clear
poison in an entire page before any of it could be accessed. With an
enlightened write-path RWF flag or custom fault handler it could do
sub-page overwrites of poison. Not that I think the driver should
optimize for multiple failed cachelines in a page, but it does mean
dax_clear_poison() fails in more theoretical scenarios.

> > With
> > those 2 properties combined I think that wants a custom pmem fault
> > handler that knows how to carefully write to pmem pages with poison
> > present, rather than an additional explicit dax-operation. That also
> > meets Christoph's requirement of "works with the intended direct
> > memory map use case".
>
> So we have 3 kinds of accesses to DAX memory:
>
>  (1) user space mmap direct access.
>  (2) iov_iter based access (could be from kernel or userspace)
>  (3) open coded kernel access using ->direct_access
>
> One thing I noticed:  (2) could also work with kernel memory or pages,
> but that doesn't use MC safe access.

Yes, but after the fight to even get copy_mc_to_kernel() to exist for
pmem_copy_to_iter() I did not have the nerve to push for wider usage.

> Which seems like a major independent
> of this discussion.
>
> I suspect all kernel access could work fine with a copy_mc_to_kernel
> helper as long as everyone actually uses it,

All kernel accesses do use it. They either route to
pmem_copy_to_iter(), or like dm-writecache, call it directly. Do you
see a kernel path that does not use that helper?

> missing required bits of (2) and (3) together with something like the
> ->clear_poison series from Jane. We just need to think hard what we
> want to do for userspace mmap access.

dax_clear_poison() is at least ready to go today and does not preclude
adding the atomic and finer grained support later.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  reply	other threads:[~2021-11-04 16:08 UTC|newest]

Thread overview: 129+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-21  0:10 [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag Jane Chu
2021-10-21  0:10 ` [dm-devel] " Jane Chu
2021-10-21  0:10 ` [PATCH 1/6] dax: introduce RWF_RECOVERY_DATA flag to preadv2() and pwritev2() Jane Chu
2021-10-21  0:10   ` [dm-devel] " Jane Chu
2021-10-21  0:10 ` [PATCH 2/6] dax: prepare dax_direct_access() API with DAXDEV_F_RECOVERY flag Jane Chu
2021-10-21  0:10   ` [dm-devel] " Jane Chu
2021-10-21 11:20   ` Christoph Hellwig
2021-10-21 11:20     ` [dm-devel] " Christoph Hellwig
2021-10-21 18:19     ` Jane Chu
2021-10-21 18:19       ` [dm-devel] " Jane Chu
2021-10-21  0:10 ` [PATCH 3/6] pmem: pmem_dax_direct_access() to honor the " Jane Chu
2021-10-21  0:10   ` [dm-devel] " Jane Chu
2021-10-21 11:23   ` Christoph Hellwig
2021-10-21 11:23     ` [dm-devel] " Christoph Hellwig
2021-10-21 18:24     ` Jane Chu
2021-10-21 18:24       ` [dm-devel] " Jane Chu
2021-10-21  0:10 ` [PATCH 4/6] dm,dax,pmem: prepare dax_copy_to/from_iter() APIs with DAXDEV_F_RECOVERY Jane Chu
2021-10-21  0:10   ` [dm-devel] [PATCH 4/6] dm, dax, pmem: " Jane Chu
2021-10-21 11:27   ` [PATCH 4/6] dm,dax,pmem: " Christoph Hellwig
2021-10-21 11:27     ` [dm-devel] [PATCH 4/6] dm, dax, pmem: " Christoph Hellwig
2021-10-22  0:49     ` [PATCH 4/6] dm,dax,pmem: " Jane Chu
2021-10-22  0:49       ` [dm-devel] [PATCH 4/6] dm, dax, pmem: " Jane Chu
2021-10-22  1:41       ` correction: Re: [PATCH 4/6] dm,dax,pmem: " Jane Chu
2021-10-22  1:41         ` [dm-devel] correction: Re: [PATCH 4/6] dm, dax, pmem: " Jane Chu
2021-10-22  5:33       ` [PATCH 4/6] dm,dax,pmem: " Christoph Hellwig
2021-10-22  5:33         ` [dm-devel] [PATCH 4/6] dm, dax, pmem: " Christoph Hellwig
2021-10-22 20:30         ` [PATCH 4/6] dm,dax,pmem: " Jane Chu
2021-10-22 20:30           ` [dm-devel] [PATCH 4/6] dm, dax, pmem: " Jane Chu
2021-10-21  0:10 ` [PATCH 5/6] dax,pmem: Add data recovery feature to pmem_copy_to/from_iter() Jane Chu
2021-10-21  0:10   ` [dm-devel] [PATCH 5/6] dax, pmem: " Jane Chu
2021-10-21 11:28   ` [PATCH 5/6] dax,pmem: " Christoph Hellwig
2021-10-21 11:28     ` [dm-devel] [PATCH 5/6] dax, pmem: " Christoph Hellwig
2021-10-22  0:58     ` [PATCH 5/6] dax,pmem: " Jane Chu
2021-10-22  0:58       ` [dm-devel] [PATCH 5/6] dax, pmem: " Jane Chu
2021-10-22  8:03   ` kernel test robot
2021-10-22  8:03     ` kernel test robot
2021-10-26 10:21   ` [PATCH 5/6] dax,pmem: " kernel test robot
2021-10-26 10:21     ` [PATCH 5/6] dax, pmem: " kernel test robot
2021-10-26 10:21     ` [dm-devel] " kernel test robot
2021-10-21  0:10 ` [PATCH 6/6] dm: Ensure dm honors DAXDEV_F_RECOVERY flag on dax only Jane Chu
2021-10-21  0:10   ` [dm-devel] " Jane Chu
2021-10-21 11:31 ` [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag Christoph Hellwig
2021-10-21 11:31   ` Christoph Hellwig
2021-10-22  1:37   ` Jane Chu
2021-10-22  1:37     ` Jane Chu
2021-10-22  1:58     ` Darrick J. Wong
2021-10-22  1:58       ` Darrick J. Wong
2021-10-22  5:38       ` Christoph Hellwig
2021-10-22  5:38         ` Christoph Hellwig
2021-10-22  5:36     ` Christoph Hellwig
2021-10-22  5:36       ` Christoph Hellwig
2021-10-22 20:52       ` Jane Chu
2021-10-22 20:52         ` Jane Chu
2021-10-27  6:49         ` Christoph Hellwig
2021-10-27  6:49           ` Christoph Hellwig
2021-10-28  0:24           ` Darrick J. Wong
2021-10-28  0:24             ` Darrick J. Wong
2021-10-28 22:59             ` Dave Chinner
2021-10-28 22:59               ` Dave Chinner
2021-10-29 11:46               ` Pavel Begunkov
2021-10-29 11:46                 ` Pavel Begunkov
2021-10-29 16:57                 ` Darrick J. Wong
2021-10-29 16:57                   ` Darrick J. Wong
2021-10-29 19:23                   ` Pavel Begunkov
2021-10-29 19:23                     ` Pavel Begunkov
2021-10-29 20:08                     ` Darrick J. Wong
2021-10-29 20:08                       ` Darrick J. Wong
2021-10-31 13:27                       ` Pavel Begunkov
2021-10-31 13:27                         ` Pavel Begunkov
2021-10-29 18:53                 ` Jane Chu
2021-10-29 18:53                   ` Jane Chu
2021-10-29 22:32                 ` Dave Chinner
2021-10-29 22:32                   ` Dave Chinner
2021-10-31 13:19                   ` Pavel Begunkov
2021-10-31 13:19                     ` Pavel Begunkov
2021-11-01  2:31                     ` Matthew Wilcox
2021-11-01  2:31                       ` Matthew Wilcox
2021-11-02  6:18             ` Christoph Hellwig
2021-11-02  6:18               ` Christoph Hellwig
2021-11-02 19:57               ` Dan Williams
2021-11-02 19:57                 ` Dan Williams
2021-11-03 16:58                 ` Christoph Hellwig
2021-11-03 16:58                   ` Christoph Hellwig
2021-11-03 20:33                   ` Dan Williams
2021-11-03 20:33                     ` Dan Williams
2021-11-04  8:30                     ` Christoph Hellwig
2021-11-04  8:30                       ` Christoph Hellwig
2021-11-04 12:29                       ` Matthew Wilcox
2021-11-04 12:29                         ` Matthew Wilcox
2021-11-04 16:24                       ` Dan Williams
2021-11-04 16:24                         ` Dan Williams
2021-11-04 17:43                         ` Christoph Hellwig
2021-11-04 17:43                           ` Christoph Hellwig
2021-11-04 17:50                           ` Dan Williams
2021-11-04 17:50                             ` Dan Williams
2021-11-04 18:05                           ` Matthew Wilcox
2021-11-04 18:05                             ` Matthew Wilcox
2021-11-04 18:33                         ` Jane Chu
2021-11-04 18:33                           ` Jane Chu
2021-11-04 19:00                           ` Dan Williams
2021-11-04 19:00                             ` Dan Williams
2021-11-04 20:27                             ` Jane Chu
2021-11-04 20:27                               ` Jane Chu
2021-11-05  0:46                               ` Dan Williams
2021-11-05  0:46                                 ` Dan Williams
2021-11-05  1:35                                 ` Dan Williams
2021-11-05  1:35                                   ` Dan Williams
2021-11-05  5:56                             ` Christoph Hellwig
2021-11-05  5:56                               ` Christoph Hellwig
2021-11-03 18:09               ` Jane Chu
2021-11-03 18:09                 ` Jane Chu
2021-11-04  6:21                 ` Dan Williams
2021-11-04  6:21                   ` Dan Williams
2021-11-04  8:36                   ` Christoph Hellwig
2021-11-04  8:36                     ` Christoph Hellwig
2021-11-04 16:08                     ` Dan Williams [this message]
2021-11-04 16:08                       ` Dan Williams
2021-11-04 17:46                       ` Christoph Hellwig
2021-11-04 17:46                         ` Christoph Hellwig
2021-11-04  8:21                 ` Christoph Hellwig
2021-11-04  8:21                   ` Christoph Hellwig
2021-11-02 16:12             ` Dan Williams
2021-11-02 16:12               ` Dan Williams
2021-11-02 16:03           ` Dan Williams
2021-11-02 16:03             ` Dan Williams
2021-11-03 16:53             ` Christoph Hellwig
2021-11-03 16:53               ` Christoph Hellwig
2021-11-06  7:41             ` Lukas Straub
2021-11-06  7:41               ` Lukas Straub

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPcyv4jaCj=qDw-MHEcYjVGHYGvX8wbJ_d3kv5nnv7agHnMViQ@mail.gmail.com' \
    --to=dan.j.williams@intel.com \
    --cc=agk@redhat.com \
    --cc=dave.jiang@intel.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=dm-devel@redhat.com \
    --cc=hch@infradead.org \
    --cc=ira.weiny@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=nvdimm@lists.linux.dev \
    --cc=snitzer@redhat.com \
    --cc=vgoyal@redhat.com \
    --cc=vishal.l.verma@intel.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.