From: Dan Williams <dan.j.williams@intel.com>
To: Jane Chu <jane.chu@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	"Darrick J. Wong" <djwong@kernel.org>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"agk@redhat.com" <agk@redhat.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"willy@infradead.org" <willy@infradead.org>,
	"vgoyal@redhat.com" <vgoyal@redhat.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
Date: Thu, 4 Nov 2021 12:00:12 -0700
Message-ID: <CAPcyv4hJjcy2TnOv-Y5=MUMHeDdN-BCH4d0xC-pFGcHXEU_ZEw@mail.gmail.com>
In-Reply-To: <6d21ece1-0201-54f2-ec5a-ae2f873d46a3@oracle.com>

On Thu, Nov 4, 2021 at 11:34 AM Jane Chu <jane.chu@oracle.com> wrote:
>
> Thanks for the enlightening discussion here, it's so helpful!
>
> Please allow me to recap what I've caught up so far -
>
> 1. recovery write at page boundary due to NP setting in poisoned
>    page to prevent undesirable prefetching
> 2. single interface to perform 3 tasks:
>      { clear-poison, update error-list, write }
>    such as an API in pmem driver.
>    For CPUs that support MOVDIR64B, the 'clear-poison' and 'write'
>    task can be combined (would need something different from the
>    existing _copy_mcsafe though) and 'update error-list' follows
>    closely behind;
>    For CPUs that rely on firmware call to clear poison, the existing
>    pmem_clear_poison() can be used, followed by the 'write' task.
> 3. if user isn't given RWF_RECOVERY_FLAG flag, then dax recovery
>    would be automatic for a write if range is page aligned;
>    otherwise, the write fails with EIO as usual.
>    Also, user mustn't have punched out the poisoned page in which
>    case poison repairing will be a lot more complicated.
> 4. desirable to fetch as much data as possible from a poisoned range.
>
> If this understanding is in the right direction, then I'd like to
> propose below changes to
>    dax_direct_access(), dax_copy_to/from_iter(), pmem_copy_to/from_iter()
>    and the dm layer copy_to/from_iter, dax_iomap_iter().
>
> 1. dax_iomap_iter() rely on dax_direct_access() to decide whether there
>    is likely media error: if the API without DAX_F_RECOVERY returns
>    -EIO, then switch to recovery-read/write code.  In recovery code,
>    supply DAX_F_RECOVERY to dax_direct_access() in order to obtain
>    'kaddr', and then call dax_copy_to/from_iter() with DAX_F_RECOVERY.

I like it. It allows for an atomic write+clear implementation on
capable platforms and coordinates with potentially unmapped pages. The
best of both worlds from the dax_clear_poison() proposal and my "take
a fault and do a slow-path copy".

> 2. the _copy_to/from_iter implementation would be largely the same
>    as in my recent patch, but some changes in Christoph's
>    'dax-devirtualize' may be kept, such as DAX_F_VIRTUAL, obviously
>    virtual devices don't have the ability to clear poison, so no need
>    to complicate them.  And this also means that not every endpoint
>    dax device has to provide dax_op.copy_to/from_iter, they may use the
>    default.

Did I miss this series or are you talking about this one?
https://lore.kernel.org/all/20211018044054.1779424-1-hch@lst.de/

> I'm not sure about nova and others, if they use different 'write' other
> than via iomap, does that mean there will be need for a new set of
> dax_op for their read/write?

No, they're out-of-tree; they'll adjust to the same interface that xfs
and ext4 are using when/if they go upstream.

> the 3-in-1 binding would always be
> required though. Maybe that'll be an ongoing discussion?

Yeah, let's cross that bridge when we come to it.

> Comments? Suggestions?

It sounds great to me!
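
For readers following the thread, the control flow agreed on above (try
the normal access first, fall back to recovery only on -EIO, and bind
clear-poison, error-list update and write into one step) can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions: every name below (model_pmem, model_direct_access(),
model_recovery_write(), model_write(), MODEL_F_RECOVERY) is hypothetical,
merely stands in for the kernel interfaces named in the mail, and is not
taken from any posted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model; none of this is kernel code. */
#define MODEL_PAGE_SIZE  4096u
#define MODEL_NPAGES     4u
#define MODEL_F_RECOVERY 0x1u /* stands in for the proposed DAX_F_RECOVERY */

struct model_pmem {
	unsigned char data[MODEL_NPAGES][MODEL_PAGE_SIZE];
	bool poisoned[MODEL_NPAGES]; /* stands in for the driver's error list */
};

/* Models dax_direct_access(): a poisoned page is refused with -EIO
 * unless the caller explicitly asks for recovery access. */
int model_direct_access(struct model_pmem *dev, size_t pgoff,
			unsigned int flags, unsigned char **kaddr)
{
	if (pgoff >= MODEL_NPAGES)
		return -EINVAL;
	if (dev->poisoned[pgoff] && !(flags & MODEL_F_RECOVERY))
		return -EIO;
	*kaddr = dev->data[pgoff];
	return 0;
}

/* Models the "3-in-1" step: clear poison, update error list, write. */
int model_recovery_write(struct model_pmem *dev, size_t pgoff,
			 unsigned char *kaddr, const void *src, size_t len)
{
	if (len != MODEL_PAGE_SIZE)
		return -EIO;
	memset(kaddr, 0, MODEL_PAGE_SIZE); /* "clear-poison" */
	dev->poisoned[pgoff] = false;      /* "update error-list" */
	memcpy(kaddr, src, len);           /* "write" */
	return 0;
}

/* Models the write path: fast path first, recovery only after -EIO,
 * and only for a write that covers the whole page. */
int model_write(struct model_pmem *dev, size_t pgoff, size_t off,
		const void *src, size_t len)
{
	unsigned char *kaddr;
	int rc;

	if (off > MODEL_PAGE_SIZE || len > MODEL_PAGE_SIZE - off)
		return -EINVAL;

	rc = model_direct_access(dev, pgoff, 0, &kaddr);
	if (rc == 0) {
		memcpy(kaddr + off, src, len);
		return 0;
	}
	if (rc != -EIO)
		return rc;

	/* Likely media error: unaligned writes still fail with -EIO. */
	if (off != 0 || len != MODEL_PAGE_SIZE)
		return -EIO;

	rc = model_direct_access(dev, pgoff, MODEL_F_RECOVERY, &kaddr);
	if (rc)
		return rc;
	return model_recovery_write(dev, pgoff, kaddr, src, len);
}
```

In this model a small write into a poisoned page fails with -EIO and
leaves the poison in place, while a full-page write to the same page
succeeds and clears it. A platform that can combine clear and write in
one operation would collapse the first and third steps of
model_recovery_write(); the model keeps them separate for readability.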