From: Jane Chu <jane.chu@oracle.com>
To: david@fromorbit.com, djwong@kernel.org, dan.j.williams@intel.com,
hch@infradead.org, vishal.l.verma@intel.com,
dave.jiang@intel.com, agk@redhat.com, snitzer@redhat.com,
dm-devel@redhat.com, ira.weiny@intel.com, willy@infradead.org,
vgoyal@redhat.com, linux-fsdevel@vger.kernel.org,
nvdimm@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-xfs@vger.kernel.org
Subject: [PATCH 5/6] dax,pmem: Add data recovery feature to pmem_copy_to/from_iter()
Date: Wed, 20 Oct 2021 18:10:58 -0600
Message-ID: <20211021001059.438843-6-jane.chu@oracle.com>
In-Reply-To: <20211021001059.438843-1-jane.chu@oracle.com>
When the DAXDEV_F_RECOVERY flag is set, pmem_copy_to_iter() reads as
much data as possible, stopping at the first poisoned page, and
pmem_copy_from_iter() tries to clear any poison within the
page-aligned range before writing.
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
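Note for reviewers: the read-side truncation described above can be
modelled in userspace. The sketch below is not the driver code. The
name model_readable_len(), the 4 KiB page size and the 512-byte sector
size are assumptions made for illustration. Unlike the patch, the model
clamps the result to the requested length and treats a bad sector that
lies before the request as "first page poisoned".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT  12u   /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE   ((size_t)1 << MODEL_PAGE_SHIFT)
#define MODEL_SECTOR_SIZE 512u  /* assumption: 512-byte badblocks sectors */

/*
 * Hypothetical model of the length computation in pmem_copy_to_iter().
 *
 * pgoff:     page offset of the first page of the request
 * lead_off:  byte offset of the request within that first page
 * bytes:     requested length in bytes
 * first_bad: first poisoned sector reported for the range
 *
 * Returns the number of bytes that can be read before the first
 * poisoned page, or 0 if the first page itself is poisoned.
 */
static size_t model_readable_len(uint64_t pgoff, size_t lead_off,
                                 size_t bytes, uint64_t first_bad)
{
    uint64_t bad_pfn = (first_bad * MODEL_SECTOR_SIZE) >> MODEL_PAGE_SHIFT;
    size_t len;

    assert(lead_off < MODEL_PAGE_SIZE);

    /* Poison in (or before) the first page: nothing is readable. */
    if (bad_pfn <= pgoff)
        return 0;

    /* Bytes from the start of the request up to the poisoned page. */
    len = (size_t)((bad_pfn - pgoff) << MODEL_PAGE_SHIFT) - lead_off;

    /* Never report more than was asked for. */
    return len < bytes ? len : bytes;
}
```

For example, a 12 KiB read that starts 0x100 bytes into page 10, with
the first bad sector at 96 (page 12), gives 2 * 4096 - 0x100 = 7936
readable bytes in this model.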
drivers/nvdimm/pmem.c | 72 ++++++++++++++++++++++++++++++++++++++++---
fs/dax.c | 5 +++
2 files changed, 72 insertions(+), 5 deletions(-)
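Note on the error convention: the copy helpers return size_t, so a
poison failure is reported as (size_t) -EIO and the caller casts the
value to ssize_t before comparing. A minimal userspace sketch of that
convention follows. model_copy() and model_consume() are hypothetical
names and do not exist in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical copy helper: byte count on success, (size_t) -EIO on poison. */
static size_t model_copy(bool poisoned, size_t bytes)
{
    return poisoned ? (size_t)-EIO : bytes;
}

/*
 * Hypothetical caller step, shaped like the check added to fs/dax.c:
 * cast the size_t result to ssize_t before comparing with -EIO, and
 * advance the running total only when the copy succeeded.
 */
static ssize_t model_consume(size_t xfer, size_t *done)
{
    assert(done != NULL);

    if ((ssize_t)xfer == -EIO)
        return -EIO;

    *done += xfer;
    return 0;
}
```

The cast matters because (size_t) -EIO is a very large unsigned value.
Added to a running total without the check, it would silently corrupt
the byte counts.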
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index e2a1c35108cd..c456f84d2f6f 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -305,21 +305,83 @@ static long pmem_dax_direct_access(struct dax_device *dax_dev,
}
/*
- * Use the 'no check' versions of copy_from_iter_flushcache() and
- * copy_mc_to_iter() to bypass HARDENED_USERCOPY overhead. Bounds
- * checking, both file offset and device offset, is handled by
- * dax_iomap_actor()
+ * The 'no check' versions of copy_from_iter_flushcache() and
+ * copy_mc_to_iter() are used to bypass HARDENED_USERCOPY overhead.
+ * 'read'/'write' are not inherently safe when poison is consumed; they
+ * are safe here only because a prior call to dax_direct_access() on the
+ * caller stack has guaranteed that the 'read'/'write' range is free of
+ * poison.
+ * With the introduction of DAXDEV_F_RECOVERY, however, the range may
+ * contain poison, so these functions check for poison explicitly:
+ * 'read' fetches only the non-poisoned page(s) up to the first poisoned
+ * page, while 'write' requires the range to be page aligned so that the
+ * poisoned page's memory type can be restored to "rw" after the poison
+ * is cleared.
+ * On a poison-related failure, (size_t) -EIO is returned; the caller
+ * may check the return value after casting it to (ssize_t).
*/
static size_t pmem_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff,
void *addr, size_t bytes, struct iov_iter *i, unsigned long flags)
{
+ phys_addr_t pmem_off;
+ size_t len, lead_off;
+ struct pmem_device *pmem = dax_get_private(dax_dev);
+ struct device *dev = pmem->bb.dev;
+
+ if (flags & DAXDEV_F_RECOVERY) {
+ lead_off = (unsigned long)addr & ~PAGE_MASK;
+ len = PFN_PHYS(PFN_UP(lead_off + bytes));
+ if (is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512, len)) {
+ if (lead_off || !(PAGE_ALIGNED(bytes))) {
+ dev_warn(dev, "Found poison, but addr(%p) and/or bytes(%#zx) not page aligned\n",
+ addr, bytes);
+ return (size_t) -EIO;
+ }
+ pmem_off = PFN_PHYS(pgoff) + pmem->data_offset;
+ if (pmem_clear_poison(pmem, pmem_off, bytes) !=
+ BLK_STS_OK)
+ return (size_t) -EIO;
+ }
+ }
+
return _copy_from_iter_flushcache(addr, bytes, i);
}
static size_t pmem_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff,
void *addr, size_t bytes, struct iov_iter *i, unsigned long flags)
{
- return _copy_mc_to_iter(addr, bytes, i);
+ int num_bad;
+ size_t len, lead_off;
+ unsigned long bad_pfn;
+ bool bad_pmem = false;
+ size_t adj_len = bytes;
+ sector_t sector, first_bad;
+ struct pmem_device *pmem = dax_get_private(dax_dev);
+ struct device *dev = pmem->bb.dev;
+
+ if (flags & DAXDEV_F_RECOVERY) {
+ sector = PFN_PHYS(pgoff) / 512;
+ lead_off = (unsigned long)addr & ~PAGE_MASK;
+ len = PFN_PHYS(PFN_UP(lead_off + bytes));
+ if (pmem->bb.count)
+ bad_pmem = !!badblocks_check(&pmem->bb, sector,
+ len / 512, &first_bad, &num_bad);
+ if (bad_pmem) {
+ bad_pfn = PHYS_PFN(first_bad * 512);
+ if (bad_pfn == pgoff) {
+ dev_warn(dev, "Found poison in page: pgoff(%#lx)\n",
+ pgoff);
+ return (size_t) -EIO;
+ }
+ adj_len = PFN_PHYS(bad_pfn - pgoff) - lead_off;
+ dev_WARN_ONCE(dev, (adj_len > bytes),
+ "out-of-range first_bad?");
+ }
+ if (adj_len == 0)
+ return (size_t) -EIO;
+ }
+
+ return _copy_mc_to_iter(addr, adj_len, i);
}
static const struct dax_operations pmem_dax_ops = {
diff --git a/fs/dax.c b/fs/dax.c
index 69433c6cd6c4..b9286668dc46 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1246,6 +1246,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
xfer = dax_copy_to_iter(dax_dev, pgoff, kaddr,
map_len, iter, dax_flag);
+ if ((ssize_t)xfer == -EIO) {
+ ret = -EIO;
+ break;
+ }
+
pos += xfer;
length -= xfer;
done += xfer;
--
2.18.4