All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/6] DAX poison recovery
@ 2022-03-19  6:28 ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

In this series, dax recovery code path is independent of that of
normal write. Competing dax recovery threads are serialized,
racing read threads are guaranteed not overlapping with the
recovery process.

In this phase, the recovery granularity is page, future patch
will explore recovery in finer granularity.

Changelog:
v5->v6:
  . per Christoph, move set{clear}_mce_nospec() inline functions out
    of include/linux/set_memory.h and into arch/x86/mm/pat/set_memory.c
    file, so that no need to export _set_memory_present().
  . per Christoph, ratelimit warning message in pmem_do_write()
  . per both Christoph and Dan, switch back to adding a flag to
    dax_direct_access() instead of embedding the flag in kvaddr
  . suggestions from Christoph for improving code structure and
    readability
  . per Dan, add .recovery_write to dev_pagemap.ops instead of adding
    it to dax_operations, such that, the DM layer doesn't need to be
    involved explicitly in dax recoovery write
  . per Dan, is_bad_pmem() takes a seqlock, so no need to place it
    under recovery_lock. 
  Many thanks for both reviewers!
  
v4->v5:
  Fixed build errors reported by kernel test robot

v3->v4:
  Rebased to v5.17-rc1-81-g0280e3c58f92

References:
v4 https://lore.kernel.org/lkml/20220126211116.860012-1-jane.chu@oracle.com/T/
v3 https://lkml.org/lkml/2022/1/11/900
v2 https://lore.kernel.org/all/20211106011638.2613039-1-jane.chu@oracle.com/
Disussions about marking poisoned page as 'np'
https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

Jane Chu (6):
  x86/mm: fix comment
  x86/mce: relocate set{clear}_mce_nospec() functions
  mce: fix set_mce_nospec to always unmap the whole page
  dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  pmem: refactor pmem_clear_poison()
  pmem: implement pmem_recovery_write()

 arch/x86/include/asm/set_memory.h |  52 ---------
 arch/x86/kernel/cpu/mce/core.c    |   6 +-
 arch/x86/mm/pat/set_memory.c      |  47 +++++++-
 drivers/dax/super.c               |  23 +++-
 drivers/md/dm-linear.c            |   4 +-
 drivers/md/dm-log-writes.c        |   5 +-
 drivers/md/dm-stripe.c            |   4 +-
 drivers/md/dm-target.c            |   2 +-
 drivers/md/dm-writecache.c        |   5 +-
 drivers/md/dm.c                   |   5 +-
 drivers/nvdimm/pmem.c             | 182 +++++++++++++++++++++---------
 drivers/nvdimm/pmem.h             |   3 +-
 drivers/s390/block/dcssblk.c      |   4 +-
 fs/dax.c                          |  32 +++++-
 fs/fuse/dax.c                     |   4 +-
 include/linux/dax.h               |  12 +-
 include/linux/device-mapper.h     |   2 +-
 include/linux/memremap.h          |   7 ++
 include/linux/set_memory.h        |  11 +-
 tools/testing/nvdimm/pmem-dax.c   |   2 +-
 20 files changed, 269 insertions(+), 143 deletions(-)

-- 
2.18.4


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 0/6] DAX poison recovery
@ 2022-03-19  6:28 ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

In this series, dax recovery code path is independent of that of
normal write. Competing dax recovery threads are serialized,
racing read threads are guaranteed not overlapping with the
recovery process.

In this phase, the recovery granularity is page, future patch
will explore recovery in finer granularity.

Changelog:
v5->v6:
  . per Christoph, move set{clear}_mce_nospec() inline functions out
    of include/linux/set_memory.h and into arch/x86/mm/pat/set_memory.c
    file, so that no need to export _set_memory_present().
  . per Christoph, ratelimit warning message in pmem_do_write()
  . per both Christoph and Dan, switch back to adding a flag to
    dax_direct_access() instead of embedding the flag in kvaddr
  . suggestions from Christoph for improving code structure and
    readability
  . per Dan, add .recovery_write to dev_pagemap.ops instead of adding
    it to dax_operations, such that, the DM layer doesn't need to be
    involved explicitly in dax recoovery write
  . per Dan, is_bad_pmem() takes a seqlock, so no need to place it
    under recovery_lock. 
  Many thanks for both reviewers!
  
v4->v5:
  Fixed build errors reported by kernel test robot

v3->v4:
  Rebased to v5.17-rc1-81-g0280e3c58f92

References:
v4 https://lore.kernel.org/lkml/20220126211116.860012-1-jane.chu@oracle.com/T/
v3 https://lkml.org/lkml/2022/1/11/900
v2 https://lore.kernel.org/all/20211106011638.2613039-1-jane.chu@oracle.com/
Disussions about marking poisoned page as 'np'
https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

Jane Chu (6):
  x86/mm: fix comment
  x86/mce: relocate set{clear}_mce_nospec() functions
  mce: fix set_mce_nospec to always unmap the whole page
  dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  pmem: refactor pmem_clear_poison()
  pmem: implement pmem_recovery_write()

 arch/x86/include/asm/set_memory.h |  52 ---------
 arch/x86/kernel/cpu/mce/core.c    |   6 +-
 arch/x86/mm/pat/set_memory.c      |  47 +++++++-
 drivers/dax/super.c               |  23 +++-
 drivers/md/dm-linear.c            |   4 +-
 drivers/md/dm-log-writes.c        |   5 +-
 drivers/md/dm-stripe.c            |   4 +-
 drivers/md/dm-target.c            |   2 +-
 drivers/md/dm-writecache.c        |   5 +-
 drivers/md/dm.c                   |   5 +-
 drivers/nvdimm/pmem.c             | 182 +++++++++++++++++++++---------
 drivers/nvdimm/pmem.h             |   3 +-
 drivers/s390/block/dcssblk.c      |   4 +-
 fs/dax.c                          |  32 +++++-
 fs/fuse/dax.c                     |   4 +-
 include/linux/dax.h               |  12 +-
 include/linux/device-mapper.h     |   2 +-
 include/linux/memremap.h          |   7 ++
 include/linux/set_memory.h        |  11 +-
 tools/testing/nvdimm/pmem-dax.c   |   2 +-
 20 files changed, 269 insertions(+), 143 deletions(-)

-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH v6 1/6] x86/mm: fix comment
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

There is no _set_memory_prot internal helper, while coming across
the code, might as well fix the comment.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/mm/pat/set_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index b4072115c8ef..042cfac6272b 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1816,7 +1816,7 @@ static inline int cpa_clear_pages_array(struct page **pages, int numpages,
 }
 
 /*
- * _set_memory_prot is an internal helper for callers that have been passed
+ * __set_memory_prot is an internal helper for callers that have been passed
  * a pgprot_t value from upper layers and a reservation has already been taken.
  * If you want to set the pgprot to a specific page protocol, use the
  * set_memory_xx() functions.
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 1/6] x86/mm: fix comment
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

There is no _set_memory_prot internal helper, while coming across
the code, might as well fix the comment.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/mm/pat/set_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index b4072115c8ef..042cfac6272b 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1816,7 +1816,7 @@ static inline int cpa_clear_pages_array(struct page **pages, int numpages,
 }
 
 /*
- * _set_memory_prot is an internal helper for callers that have been passed
+ * __set_memory_prot is an internal helper for callers that have been passed
  * a pgprot_t value from upper layers and a reservation has already been taken.
  * If you want to set the pgprot to a specific page protocol, use the
  * set_memory_xx() functions.
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
file where they belong.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/include/asm/set_memory.h | 52 -------------------------------
 arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
 include/linux/set_memory.h        |  9 +++---
 3 files changed, 53 insertions(+), 56 deletions(-)

diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index ff0f2d90338a..648be0bd20df 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -88,56 +88,4 @@ void notify_range_enc_status_changed(unsigned long vaddr, int npages, bool enc);
 
 extern int kernel_set_to_readonly;
 
-#ifdef CONFIG_X86_64
-/*
- * Prevent speculative access to the page by either unmapping
- * it (if we do not require access to any part of the page) or
- * marking it uncacheable (if we want to try to retrieve data
- * from non-poisoned lines in the page).
- */
-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
-{
-	unsigned long decoy_addr;
-	int rc;
-
-	/* SGX pages are not in the 1:1 map */
-	if (arch_is_platform_page(pfn << PAGE_SHIFT))
-		return 0;
-	/*
-	 * We would like to just call:
-	 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
-	 * but doing that would radically increase the odds of a
-	 * speculative access to the poison page because we'd have
-	 * the virtual address of the kernel 1:1 mapping sitting
-	 * around in registers.
-	 * Instead we get tricky.  We create a non-canonical address
-	 * that looks just like the one we want, but has bit 63 flipped.
-	 * This relies on set_memory_XX() properly sanitizing any __pa()
-	 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
-	 */
-	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
-
-	if (unmap)
-		rc = set_memory_np(decoy_addr, 1);
-	else
-		rc = set_memory_uc(decoy_addr, 1);
-	if (rc)
-		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
-	return rc;
-}
-#define set_mce_nospec set_mce_nospec
-
-/* Restore full speculative operation to the pfn. */
-static inline int clear_mce_nospec(unsigned long pfn)
-{
-	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
-}
-#define clear_mce_nospec clear_mce_nospec
-#else
-/*
- * Few people would run a 32-bit kernel on a machine that supports
- * recoverable errors because they have too much memory to boot 32-bit.
- */
-#endif
-
 #endif /* _ASM_X86_SET_MEMORY_H */
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 042cfac6272b..9abc6077d768 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1925,6 +1925,54 @@ int set_memory_wb(unsigned long addr, int numpages)
 }
 EXPORT_SYMBOL(set_memory_wb);
 
+#ifdef CONFIG_X86_64
+/*
+ * Prevent speculative access to the page by either unmapping
+ * it (if we do not require access to any part of the page) or
+ * marking it uncacheable (if we want to try to retrieve data
+ * from non-poisoned lines in the page).
+ */
+int set_mce_nospec(unsigned long pfn, bool unmap)
+{
+	unsigned long decoy_addr;
+	int rc;
+
+	/* SGX pages are not in the 1:1 map */
+	if (arch_is_platform_page(pfn << PAGE_SHIFT))
+		return 0;
+	/*
+	 * We would like to just call:
+	 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
+	 * but doing that would radically increase the odds of a
+	 * speculative access to the poison page because we'd have
+	 * the virtual address of the kernel 1:1 mapping sitting
+	 * around in registers.
+	 * Instead we get tricky.  We create a non-canonical address
+	 * that looks just like the one we want, but has bit 63 flipped.
+	 * This relies on set_memory_XX() properly sanitizing any __pa()
+	 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
+	 */
+	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
+
+	if (unmap)
+		rc = set_memory_np(decoy_addr, 1);
+	else
+		rc = set_memory_uc(decoy_addr, 1);
+	if (rc)
+		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
+	return rc;
+}
+EXPORT_SYMBOL(set_mce_nospec);
+
+/* Restore full speculative operation to the pfn. */
+int clear_mce_nospec(unsigned long pfn)
+{
+	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
+}
+EXPORT_SYMBOL(clear_mce_nospec);
+
+#endif
+
 int set_memory_x(unsigned long addr, int numpages)
 {
 	if (!(__supported_pte_mask & _PAGE_NX))
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index f36be5166c19..d6263d7afb55 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -42,20 +42,21 @@ static inline bool can_set_direct_map(void)
 #endif
 #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
 
-#ifndef set_mce_nospec
+#ifdef CONFIG_X86_64
+int set_mce_nospec(unsigned long pfn, bool unmap);
+int clear_mce_nospec(unsigned long pfn);
+#else
 static inline int set_mce_nospec(unsigned long pfn, bool unmap)
 {
 	return 0;
 }
-#endif
-
-#ifndef clear_mce_nospec
 static inline int clear_mce_nospec(unsigned long pfn)
 {
 	return 0;
 }
 #endif
 
+
 #ifndef CONFIG_ARCH_HAS_MEM_ENCRYPT
 static inline int set_memory_encrypted(unsigned long addr, int numpages)
 {
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
file where they belong.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/include/asm/set_memory.h | 52 -------------------------------
 arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
 include/linux/set_memory.h        |  9 +++---
 3 files changed, 53 insertions(+), 56 deletions(-)

diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index ff0f2d90338a..648be0bd20df 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -88,56 +88,4 @@ void notify_range_enc_status_changed(unsigned long vaddr, int npages, bool enc);
 
 extern int kernel_set_to_readonly;
 
-#ifdef CONFIG_X86_64
-/*
- * Prevent speculative access to the page by either unmapping
- * it (if we do not require access to any part of the page) or
- * marking it uncacheable (if we want to try to retrieve data
- * from non-poisoned lines in the page).
- */
-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
-{
-	unsigned long decoy_addr;
-	int rc;
-
-	/* SGX pages are not in the 1:1 map */
-	if (arch_is_platform_page(pfn << PAGE_SHIFT))
-		return 0;
-	/*
-	 * We would like to just call:
-	 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
-	 * but doing that would radically increase the odds of a
-	 * speculative access to the poison page because we'd have
-	 * the virtual address of the kernel 1:1 mapping sitting
-	 * around in registers.
-	 * Instead we get tricky.  We create a non-canonical address
-	 * that looks just like the one we want, but has bit 63 flipped.
-	 * This relies on set_memory_XX() properly sanitizing any __pa()
-	 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
-	 */
-	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
-
-	if (unmap)
-		rc = set_memory_np(decoy_addr, 1);
-	else
-		rc = set_memory_uc(decoy_addr, 1);
-	if (rc)
-		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
-	return rc;
-}
-#define set_mce_nospec set_mce_nospec
-
-/* Restore full speculative operation to the pfn. */
-static inline int clear_mce_nospec(unsigned long pfn)
-{
-	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
-}
-#define clear_mce_nospec clear_mce_nospec
-#else
-/*
- * Few people would run a 32-bit kernel on a machine that supports
- * recoverable errors because they have too much memory to boot 32-bit.
- */
-#endif
-
 #endif /* _ASM_X86_SET_MEMORY_H */
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 042cfac6272b..9abc6077d768 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1925,6 +1925,54 @@ int set_memory_wb(unsigned long addr, int numpages)
 }
 EXPORT_SYMBOL(set_memory_wb);
 
+#ifdef CONFIG_X86_64
+/*
+ * Prevent speculative access to the page by either unmapping
+ * it (if we do not require access to any part of the page) or
+ * marking it uncacheable (if we want to try to retrieve data
+ * from non-poisoned lines in the page).
+ */
+int set_mce_nospec(unsigned long pfn, bool unmap)
+{
+	unsigned long decoy_addr;
+	int rc;
+
+	/* SGX pages are not in the 1:1 map */
+	if (arch_is_platform_page(pfn << PAGE_SHIFT))
+		return 0;
+	/*
+	 * We would like to just call:
+	 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
+	 * but doing that would radically increase the odds of a
+	 * speculative access to the poison page because we'd have
+	 * the virtual address of the kernel 1:1 mapping sitting
+	 * around in registers.
+	 * Instead we get tricky.  We create a non-canonical address
+	 * that looks just like the one we want, but has bit 63 flipped.
+	 * This relies on set_memory_XX() properly sanitizing any __pa()
+	 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
+	 */
+	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
+
+	if (unmap)
+		rc = set_memory_np(decoy_addr, 1);
+	else
+		rc = set_memory_uc(decoy_addr, 1);
+	if (rc)
+		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
+	return rc;
+}
+EXPORT_SYMBOL(set_mce_nospec);
+
+/* Restore full speculative operation to the pfn. */
+int clear_mce_nospec(unsigned long pfn)
+{
+	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
+}
+EXPORT_SYMBOL(clear_mce_nospec);
+
+#endif
+
 int set_memory_x(unsigned long addr, int numpages)
 {
 	if (!(__supported_pte_mask & _PAGE_NX))
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index f36be5166c19..d6263d7afb55 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -42,20 +42,21 @@ static inline bool can_set_direct_map(void)
 #endif
 #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
 
-#ifndef set_mce_nospec
+#ifdef CONFIG_X86_64
+int set_mce_nospec(unsigned long pfn, bool unmap);
+int clear_mce_nospec(unsigned long pfn);
+#else
 static inline int set_mce_nospec(unsigned long pfn, bool unmap)
 {
 	return 0;
 }
-#endif
-
-#ifndef clear_mce_nospec
 static inline int clear_mce_nospec(unsigned long pfn)
 {
 	return 0;
 }
 #endif
 
+
 #ifndef CONFIG_ARCH_HAS_MEM_ENCRYPT
 static inline int set_memory_encrypted(unsigned long addr, int numpages)
 {
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Mark poisoned page as not present, and to reverse the 'np' effect,
restate the _PAGE_PRESENT bit. Please refer to discussions here for
reason behind the decision.
https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

Now since poisoned page is marked as not-present, in order to
avoid writing to a 'np' page and trigger kernel Oops, also fix
pmem_do_write().

Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/kernel/cpu/mce/core.c |  6 +++---
 arch/x86/mm/pat/set_memory.c   | 21 +++++++++------------
 drivers/nvdimm/pmem.c          | 31 +++++++------------------------
 include/linux/set_memory.h     |  4 ++--
 4 files changed, 21 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5818b837fd4d..8d12739f283d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -613,7 +613,7 @@ static int uc_decode_notifier(struct notifier_block *nb, unsigned long val,
 
 	pfn = mce->addr >> PAGE_SHIFT;
 	if (!memory_failure(pfn, 0)) {
-		set_mce_nospec(pfn, whole_page(mce));
+		set_mce_nospec(pfn);
 		mce->kflags |= MCE_HANDLED_UC;
 	}
 
@@ -1297,7 +1297,7 @@ static void kill_me_maybe(struct callback_head *cb)
 
 	ret = memory_failure(p->mce_addr >> PAGE_SHIFT, flags);
 	if (!ret) {
-		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
 		sync_core();
 		return;
 	}
@@ -1321,7 +1321,7 @@ static void kill_me_never(struct callback_head *cb)
 	p->mce_count = 0;
 	pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr);
 	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, 0))
-		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
 }
 
 static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callback_head *))
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 9abc6077d768..747614c3dd83 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1925,14 +1925,14 @@ int set_memory_wb(unsigned long addr, int numpages)
 }
 EXPORT_SYMBOL(set_memory_wb);
 
+static int _set_memory_present(unsigned long addr, int numpages)
+{
+	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
+}
+
 #ifdef CONFIG_X86_64
-/*
- * Prevent speculative access to the page by either unmapping
- * it (if we do not require access to any part of the page) or
- * marking it uncacheable (if we want to try to retrieve data
- * from non-poisoned lines in the page).
- */
-int set_mce_nospec(unsigned long pfn, bool unmap)
+/* Prevent speculative access to a page by marking it not-present */
+int set_mce_nospec(unsigned long pfn)
 {
 	unsigned long decoy_addr;
 	int rc;
@@ -1954,10 +1954,7 @@ int set_mce_nospec(unsigned long pfn, bool unmap)
 	 */
 	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
 
-	if (unmap)
-		rc = set_memory_np(decoy_addr, 1);
-	else
-		rc = set_memory_uc(decoy_addr, 1);
+	rc = set_memory_np(decoy_addr, 1);
 	if (rc)
 		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
 	return rc;
@@ -1967,7 +1964,7 @@ EXPORT_SYMBOL(set_mce_nospec);
 /* Restore full speculative operation to the pfn. */
 int clear_mce_nospec(unsigned long pfn)
 {
-	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
+	return _set_memory_present((unsigned long) pfn_to_kaddr(pfn), 1);
 }
 EXPORT_SYMBOL(clear_mce_nospec);
 
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 58d95242a836..30c71a68175b 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -158,36 +158,19 @@ static blk_status_t pmem_do_write(struct pmem_device *pmem,
 			struct page *page, unsigned int page_off,
 			sector_t sector, unsigned int len)
 {
-	blk_status_t rc = BLK_STS_OK;
-	bool bad_pmem = false;
 	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
-	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
-		bad_pmem = true;
+	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
+		blk_status_t rc = pmem_clear_poison(pmem, pmem_off, len);
 
-	/*
-	 * Note that we write the data both before and after
-	 * clearing poison.  The write before clear poison
-	 * handles situations where the latest written data is
-	 * preserved and the clear poison operation simply marks
-	 * the address range as valid without changing the data.
-	 * In this case application software can assume that an
-	 * interrupted write will either return the new good
-	 * data or an error.
-	 *
-	 * However, if pmem_clear_poison() leaves the data in an
-	 * indeterminate state we need to perform the write
-	 * after clear poison.
-	 */
+		if (rc != BLK_STS_OK)
+			pr_warn_ratelimited("%s: failed to clear poison\n", __func__);
+			return rc;
+	}
 	flush_dcache_page(page);
 	write_pmem(pmem_addr, page, page_off, len);
-	if (unlikely(bad_pmem)) {
-		rc = pmem_clear_poison(pmem, pmem_off, len);
-		write_pmem(pmem_addr, page, page_off, len);
-	}
-
-	return rc;
+	return BLK_STS_OK;
 }
 
 static void pmem_submit_bio(struct bio *bio)
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index d6263d7afb55..cde2d8687a7b 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -43,10 +43,10 @@ static inline bool can_set_direct_map(void)
 #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
 
 #ifdef CONFIG_X86_64
-int set_mce_nospec(unsigned long pfn, bool unmap);
+int set_mce_nospec(unsigned long pfn);
 int clear_mce_nospec(unsigned long pfn);
 #else
-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
+static inline int set_mce_nospec(unsigned long pfn)
 {
 	return 0;
 }
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Mark poisoned page as not present, and to reverse the 'np' effect,
restate the _PAGE_PRESENT bit. Please refer to discussions here for
reason behind the decision.
https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

Now since poisoned page is marked as not-present, in order to
avoid writing to a 'np' page and trigger kernel Oops, also fix
pmem_do_write().

Fixes: 284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 arch/x86/kernel/cpu/mce/core.c |  6 +++---
 arch/x86/mm/pat/set_memory.c   | 21 +++++++++------------
 drivers/nvdimm/pmem.c          | 31 +++++++------------------------
 include/linux/set_memory.h     |  4 ++--
 4 files changed, 21 insertions(+), 41 deletions(-)

diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c
index 5818b837fd4d..8d12739f283d 100644
--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -613,7 +613,7 @@ static int uc_decode_notifier(struct notifier_block *nb, unsigned long val,
 
 	pfn = mce->addr >> PAGE_SHIFT;
 	if (!memory_failure(pfn, 0)) {
-		set_mce_nospec(pfn, whole_page(mce));
+		set_mce_nospec(pfn);
 		mce->kflags |= MCE_HANDLED_UC;
 	}
 
@@ -1297,7 +1297,7 @@ static void kill_me_maybe(struct callback_head *cb)
 
 	ret = memory_failure(p->mce_addr >> PAGE_SHIFT, flags);
 	if (!ret) {
-		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
 		sync_core();
 		return;
 	}
@@ -1321,7 +1321,7 @@ static void kill_me_never(struct callback_head *cb)
 	p->mce_count = 0;
 	pr_err("Kernel accessed poison in user space at %llx\n", p->mce_addr);
 	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, 0))
-		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
+		set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
 }
 
 static void queue_task_work(struct mce *m, char *msg, void (*func)(struct callback_head *))
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 9abc6077d768..747614c3dd83 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -1925,14 +1925,14 @@ int set_memory_wb(unsigned long addr, int numpages)
 }
 EXPORT_SYMBOL(set_memory_wb);
 
+static int _set_memory_present(unsigned long addr, int numpages)
+{
+	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
+}
+
 #ifdef CONFIG_X86_64
-/*
- * Prevent speculative access to the page by either unmapping
- * it (if we do not require access to any part of the page) or
- * marking it uncacheable (if we want to try to retrieve data
- * from non-poisoned lines in the page).
- */
-int set_mce_nospec(unsigned long pfn, bool unmap)
+/* Prevent speculative access to a page by marking it not-present */
+int set_mce_nospec(unsigned long pfn)
 {
 	unsigned long decoy_addr;
 	int rc;
@@ -1954,10 +1954,7 @@ int set_mce_nospec(unsigned long pfn, bool unmap)
 	 */
 	decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
 
-	if (unmap)
-		rc = set_memory_np(decoy_addr, 1);
-	else
-		rc = set_memory_uc(decoy_addr, 1);
+	rc = set_memory_np(decoy_addr, 1);
 	if (rc)
 		pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
 	return rc;
@@ -1967,7 +1964,7 @@ EXPORT_SYMBOL(set_mce_nospec);
 /* Restore full speculative operation to the pfn. */
 int clear_mce_nospec(unsigned long pfn)
 {
-	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
+	return _set_memory_present((unsigned long) pfn_to_kaddr(pfn), 1);
 }
 EXPORT_SYMBOL(clear_mce_nospec);
 
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 58d95242a836..30c71a68175b 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -158,36 +158,19 @@ static blk_status_t pmem_do_write(struct pmem_device *pmem,
 			struct page *page, unsigned int page_off,
 			sector_t sector, unsigned int len)
 {
-	blk_status_t rc = BLK_STS_OK;
-	bool bad_pmem = false;
 	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
-	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
-		bad_pmem = true;
+	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
+		blk_status_t rc = pmem_clear_poison(pmem, pmem_off, len);
 
-	/*
-	 * Note that we write the data both before and after
-	 * clearing poison.  The write before clear poison
-	 * handles situations where the latest written data is
-	 * preserved and the clear poison operation simply marks
-	 * the address range as valid without changing the data.
-	 * In this case application software can assume that an
-	 * interrupted write will either return the new good
-	 * data or an error.
-	 *
-	 * However, if pmem_clear_poison() leaves the data in an
-	 * indeterminate state we need to perform the write
-	 * after clear poison.
-	 */
+		if (rc != BLK_STS_OK)
+			pr_warn_ratelimited("%s: failed to clear poison\n", __func__);
+			return rc;
+	}
 	flush_dcache_page(page);
 	write_pmem(pmem_addr, page, page_off, len);
-	if (unlikely(bad_pmem)) {
-		rc = pmem_clear_poison(pmem, pmem_off, len);
-		write_pmem(pmem_addr, page, page_off, len);
-	}
-
-	return rc;
+	return BLK_STS_OK;
 }
 
 static void pmem_submit_bio(struct bio *bio)
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index d6263d7afb55..cde2d8687a7b 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -43,10 +43,10 @@ static inline bool can_set_direct_map(void)
 #endif /* CONFIG_ARCH_HAS_SET_DIRECT_MAP */
 
 #ifdef CONFIG_X86_64
-int set_mce_nospec(unsigned long pfn, bool unmap);
+int set_mce_nospec(unsigned long pfn);
 int clear_mce_nospec(unsigned long pfn);
 #else
-static inline int set_mce_nospec(unsigned long pfn, bool unmap)
+static inline int set_mce_nospec(unsigned long pfn)
 {
 	return 0;
 }
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
not set by default in dax_direct_access() such that the helper
does not translate a pmem range to kernel virtual address if the
range contains uncorrectable errors.  When the flag is set,
the helper ignores the UEs and return kernel virtual adderss so
that the caller may get on with data recovery via write.

Also introduce a new dev_pagemap_ops .recovery_write function.
The function is applicable to FSDAX device only. The device
page backend driver provides .recovery_write function if the
device has underlying mechanism to clear the uncorrectable
errors on the fly.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/dax/super.c             | 23 +++++++++++++++++++++--
 drivers/md/dm-linear.c          |  4 ++--
 drivers/md/dm-log-writes.c      |  5 +++--
 drivers/md/dm-stripe.c          |  4 ++--
 drivers/md/dm-target.c          |  2 +-
 drivers/md/dm-writecache.c      |  5 +++--
 drivers/md/dm.c                 |  5 +++--
 drivers/nvdimm/pmem.c           | 27 +++++++++++++++++++++++----
 drivers/nvdimm/pmem.h           |  2 +-
 drivers/s390/block/dcssblk.c    |  4 ++--
 fs/dax.c                        | 32 ++++++++++++++++++++++++++------
 fs/fuse/dax.c                   |  4 ++--
 include/linux/dax.h             | 12 +++++++++---
 include/linux/device-mapper.h   |  2 +-
 include/linux/memremap.h        |  7 +++++++
 tools/testing/nvdimm/pmem-dax.c |  2 +-
 16 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e3029389d809..c0f0c8f980b1 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -28,6 +28,7 @@ struct dax_device {
 	void *private;
 	unsigned long flags;
 	const struct dax_operations *ops;
+	struct dev_pagemap *pgmap;
 };
 
 static dev_t dax_devt;
@@ -116,6 +117,7 @@ enum dax_device_flags {
  * @dax_dev: a dax_device instance representing the logical memory range
  * @pgoff: offset in pages from the start of the device to translate
  * @nr_pages: number of consecutive pages caller can handle relative to @pfn
+ * @flags: by default 0, set to DAX_RECOVERY to kick start dax recovery
  * @kaddr: output parameter that returns a virtual address mapping of pfn
  * @pfn: output parameter that returns an absolute pfn translation of @pgoff
  *
@@ -123,7 +125,7 @@ enum dax_device_flags {
  * pages accessible at the device relative @pgoff.
  */
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		void **kaddr, pfn_t *pfn)
+		int flags, void **kaddr, pfn_t *pfn)
 {
 	long avail;
 
@@ -136,7 +138,7 @@ long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
 	if (nr_pages < 0)
 		return -EINVAL;
 
-	avail = dax_dev->ops->direct_access(dax_dev, pgoff, nr_pages,
+	avail = dax_dev->ops->direct_access(dax_dev, pgoff, nr_pages, flags,
 			kaddr, pfn);
 	if (!avail)
 		return -ERANGE;
@@ -193,6 +195,18 @@ int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
 }
 EXPORT_SYMBOL_GPL(dax_zero_page_range);
 
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *iter)
+{
+	struct dev_pagemap *pgmap = dax_dev->pgmap;
+
+	if (!pgmap || !pgmap->ops->recovery_write)
+		return -EIO;
+	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
+				(void *)iter);
+}
+EXPORT_SYMBOL_GPL(dax_recovery_write);
+
 #ifdef CONFIG_ARCH_HAS_PMEM_API
 void arch_wb_cache_pmem(void *addr, size_t size);
 void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
@@ -247,6 +261,11 @@ void set_dax_nomc(struct dax_device *dax_dev)
 	set_bit(DAXDEV_NOMC, &dax_dev->flags);
 }
 EXPORT_SYMBOL_GPL(set_dax_nomc);
+void set_dax_pgmap(struct dax_device *dax_dev, struct dev_pagemap *pgmap)
+{
+	dax_dev->pgmap = pgmap;
+}
+EXPORT_SYMBOL_GPL(set_dax_pgmap);
 
 bool dax_alive(struct dax_device *dax_dev)
 {
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 1b97a11d7151..bfd8895317c0 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -173,11 +173,11 @@ static struct dax_device *linear_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = linear_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int linear_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 139b09b06eda..8ee8a9f5b161 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -912,11 +912,12 @@ static struct dax_device *log_writes_dax_pgoff(struct dm_target *ti,
 }
 
 static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-					 long nr_pages, void **kaddr, pfn_t *pfn)
+					 long nr_pages, int flags,
+					 void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = log_writes_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int log_writes_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index e566115ec0bb..af043d93ef53 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -317,11 +317,11 @@ static struct dax_device *stripe_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = stripe_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int stripe_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 64dd0b34fcf4..24b1e5628f3a 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -142,7 +142,7 @@ static void io_err_release_clone_rq(struct request *clone,
 }
 
 static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	return -EIO;
 }
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 4f31591d2d25..58af379107ec 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -286,7 +286,8 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 
 	id = dax_read_lock();
 
-	da = dax_direct_access(wc->ssd_dev->dax_dev, offset, p, &wc->memory_map, &pfn);
+	da = dax_direct_access(wc->ssd_dev->dax_dev, offset, p, 0,
+			&wc->memory_map, &pfn);
 	if (da < 0) {
 		wc->memory_map = NULL;
 		r = da;
@@ -309,7 +310,7 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 		do {
 			long daa;
 			daa = dax_direct_access(wc->ssd_dev->dax_dev, offset + i, p - i,
-						NULL, &pfn);
+						0, NULL, &pfn);
 			if (daa <= 0) {
 				r = daa ? daa : -EINVAL;
 				goto err3;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 997ace47bbd5..6b6509bfaa1a 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1001,7 +1001,8 @@ static struct dm_target *dm_dax_get_live_target(struct mapped_device *md,
 }
 
 static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-				 long nr_pages, void **kaddr, pfn_t *pfn)
+				 long nr_pages, int flags, void **kaddr,
+				 pfn_t *pfn)
 {
 	struct mapped_device *md = dax_get_private(dax_dev);
 	sector_t sector = pgoff * PAGE_SECTORS;
@@ -1019,7 +1020,7 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 	if (len < 1)
 		goto out;
 	nr_pages = min(len, nr_pages);
-	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
+	ret = ti->type->direct_access(ti, pgoff, nr_pages, flags, kaddr, pfn);
 
  out:
 	dm_put_live_table(md, srcu_idx);
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 30c71a68175b..7cdaa279beca 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -238,11 +238,11 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 
 /* see "strong" declaration in tools/testing/nvdimm/pmem-dax.c */
 __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 
-	if (unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
+	if (!flags && unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
 					PFN_PHYS(nr_pages))))
 		return -EIO;
 
@@ -277,11 +277,12 @@ static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
 }
 
 static long pmem_dax_direct_access(struct dax_device *dax_dev,
-		pgoff_t pgoff, long nr_pages, void **kaddr, pfn_t *pfn)
+		pgoff_t pgoff, long nr_pages, int flags, void **kaddr,
+		pfn_t *pfn)
 {
 	struct pmem_device *pmem = dax_get_private(dax_dev);
 
-	return __pmem_direct_access(pmem, pgoff, nr_pages, kaddr, pfn);
+	return __pmem_direct_access(pmem, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static const struct dax_operations pmem_dax_ops = {
@@ -289,6 +290,17 @@ static const struct dax_operations pmem_dax_ops = {
 	.zero_page_range = pmem_dax_zero_page_range,
 };
 
+static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
+		void *addr, size_t bytes, void *iter)
+{
+	struct pmem_device *pmem = pgmap->owner;
+
+	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
+
+	/* XXX more later */
+	return 0;
+}
+
 static ssize_t write_cache_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -349,6 +361,10 @@ static void pmem_release_disk(void *__pmem)
 	blk_cleanup_disk(pmem->disk);
 }
 
+static const struct dev_pagemap_ops pmem_pgmap_ops = {
+	.recovery_write = pmem_recovery_write,
+};
+
 static int pmem_attach_disk(struct device *dev,
 		struct nd_namespace_common *ndns)
 {
@@ -380,6 +396,8 @@ static int pmem_attach_disk(struct device *dev,
 		rc = nvdimm_setup_pfn(nd_pfn, &pmem->pgmap);
 		if (rc)
 			return rc;
+		if (nd_pfn->mode == PFN_MODE_PMEM)
+			pmem->pgmap.ops = &pmem_pgmap_ops;
 	}
 
 	/* we're attaching a block device, disable raw namespace access */
@@ -464,6 +482,7 @@ static int pmem_attach_disk(struct device *dev,
 	}
 	set_dax_nocache(dax_dev);
 	set_dax_nomc(dax_dev);
+	set_dax_pgmap(dax_dev, &pmem->pgmap);
 	if (is_nvdimm_sync(nd_region))
 		set_dax_synchronous(dax_dev);
 	rc = dax_add_host(dax_dev, disk);
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 59cfe13ea8a8..823eeffa7298 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -27,7 +27,7 @@ struct pmem_device {
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flag, void **kaddr, pfn_t *pfn);
 
 #ifdef CONFIG_MEMORY_FAILURE
 static inline bool test_and_clear_pmem_poison(struct page *page)
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index d614843caf6c..c3fbf500868f 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -32,7 +32,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_submit_bio(struct bio *bio);
 static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 
 static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0";
 
@@ -927,7 +927,7 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 
 static long
 dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dcssblk_dev_info *dev_info = dax_get_private(dax_dev);
 
diff --git a/fs/dax.c b/fs/dax.c
index cd03485867a7..9f7efd0a1dd9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -722,7 +722,7 @@ static int copy_cow_page_dax(struct vm_fault *vmf, const struct iomap_iter *iter
 	int id;
 
 	id = dax_read_lock();
-	rc = dax_direct_access(iter->iomap.dax_dev, pgoff, 1, &kaddr, NULL);
+	rc = dax_direct_access(iter->iomap.dax_dev, pgoff, 1, 0, &kaddr, NULL);
 	if (rc < 0) {
 		dax_read_unlock(id);
 		return rc;
@@ -1013,7 +1013,7 @@ static int dax_iomap_pfn(const struct iomap *iomap, loff_t pos, size_t size,
 	long length;
 
 	id = dax_read_lock();
-	length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size),
+	length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size), 0,
 				   NULL, pfnp);
 	if (length < 0) {
 		rc = length;
@@ -1123,7 +1123,7 @@ static int dax_memzero(struct dax_device *dax_dev, pgoff_t pgoff,
 	void *kaddr;
 	long ret;
 
-	ret = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
+	ret = dax_direct_access(dax_dev, pgoff, 1, 0, &kaddr, NULL);
 	if (ret > 0) {
 		memset(kaddr + offset, 0, size);
 		dax_flush(dax_dev, kaddr + offset, size);
@@ -1240,15 +1240,27 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		const size_t size = ALIGN(length + offset, PAGE_SIZE);
 		pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
 		ssize_t map_len;
+		int flags, recov;
 		void *kaddr;
+		long nrpg;
 
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
 			break;
 		}
 
-		map_len = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size),
-				&kaddr, NULL);
+		recov = 0;
+		flags = 0;
+		nrpg = PHYS_PFN(size);
+		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
+					&kaddr, NULL);
+		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
+			flags |= DAX_RECOVERY;
+			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
+						flags, &kaddr, NULL);
+			if (map_len > 0)
+				recov++;
+		}
 		if (map_len < 0) {
 			ret = map_len;
 			break;
@@ -1260,7 +1272,10 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		if (map_len > end - pos)
 			map_len = end - pos;
 
-		if (iov_iter_rw(iter) == WRITE)
+		if (recov)
+			xfer = dax_recovery_write(dax_dev, pgoff, kaddr,
+					map_len, iter);
+		else if (iov_iter_rw(iter) == WRITE)
 			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
 					map_len, iter);
 		else
@@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		length -= xfer;
 		done += xfer;
 
+		if (recov && (xfer == (ssize_t) -EIO)) {
+			pr_warn("dax_recovery_write failed\n");
+			ret = -EIO;
+			break;
+		}
 		if (xfer == 0)
 			ret = -EFAULT;
 		if (xfer < map_len)
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 182b24a14804..e22c3ca9fdce 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -1241,8 +1241,8 @@ static int fuse_dax_mem_range_init(struct fuse_conn_dax *fcd)
 	INIT_DELAYED_WORK(&fcd->free_work, fuse_dax_free_mem_worker);
 
 	id = dax_read_lock();
-	nr_pages = dax_direct_access(fcd->dev, 0, PHYS_PFN(dax_size), NULL,
-				     NULL);
+	nr_pages = dax_direct_access(fcd->dev, 0, PHYS_PFN(dax_size), 0,
+				NULL, NULL);
 	dax_read_unlock(id);
 	if (nr_pages < 0) {
 		pr_debug("dax_direct_access() returned %ld\n", nr_pages);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9fc5f99a0ae2..50bf4b0fb9b6 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -14,14 +14,17 @@ struct iomap_ops;
 struct iomap_iter;
 struct iomap;
 
+/* Flag to communicate for DAX recovery operation */
+#define	DAX_RECOVERY	0x1
+
 struct dax_operations {
 	/*
 	 * direct_access: translate a device-relative
 	 * logical-page-offset into an absolute physical pfn. Return the
 	 * number of pages available for DAX at that pfn.
 	 */
-	long (*direct_access)(struct dax_device *, pgoff_t, long,
-			void **, pfn_t *);
+	long (*direct_access)(struct dax_device *dax_dev, pgoff_t pgoff,
+			long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 	/*
 	 * Validate whether this device is usable as an fsdax backing
 	 * device.
@@ -40,6 +43,8 @@ void dax_write_cache(struct dax_device *dax_dev, bool wc);
 bool dax_write_cache_enabled(struct dax_device *dax_dev);
 bool dax_synchronous(struct dax_device *dax_dev);
 void set_dax_synchronous(struct dax_device *dax_dev);
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
+		size_t bytes, struct iov_iter *i);
 /*
  * Check if given mapping is supported by the file / underlying device.
  */
@@ -91,6 +96,7 @@ static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
 
 void set_dax_nocache(struct dax_device *dax_dev);
 void set_dax_nomc(struct dax_device *dax_dev);
+void set_dax_pgmap(struct dax_device *dax_dev, struct dev_pagemap *pgmap);
 
 struct writeback_control;
 #if defined(CONFIG_BLOCK) && defined(CONFIG_FS_DAX)
@@ -178,7 +184,7 @@ static inline void dax_read_unlock(int id)
 bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		void **kaddr, pfn_t *pfn);
+		int flags, void **kaddr, pfn_t *pfn);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index b26fecf6c8e8..8cdfd7566a38 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -146,7 +146,7 @@ typedef int (*dm_busy_fn) (struct dm_target *ti);
  * >= 0 : the number of bytes accessible at the address
  */
 typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
 		size_t nr_pages);
 
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 1fafcc38acba..a495e3c4c5fc 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -77,6 +77,13 @@ struct dev_pagemap_ops {
 	 * the page back to a CPU accessible page.
 	 */
 	vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
+
+	/*
+	 * Used for FS DAX device only. For synchronous recovery from DAX media
+	 * encountering Uncorrectable Error.
+	 */
+	size_t (*recovery_write)(struct dev_pagemap *pgmap, pgoff_t pgoff,
+			void *addr, size_t bytes, void *iter);
 };
 
 #define PGMAP_ALTMAP_VALID	(1 << 0)
diff --git a/tools/testing/nvdimm/pmem-dax.c b/tools/testing/nvdimm/pmem-dax.c
index af19c85558e7..287db5e3e293 100644
--- a/tools/testing/nvdimm/pmem-dax.c
+++ b/tools/testing/nvdimm/pmem-dax.c
@@ -8,7 +8,7 @@
 #include <nd.h>
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
not set by default in dax_direct_access() such that the helper
does not translate a pmem range to kernel virtual address if the
range contains uncorrectable errors.  When the flag is set,
the helper ignores the UEs and return kernel virtual adderss so
that the caller may get on with data recovery via write.

Also introduce a new dev_pagemap_ops .recovery_write function.
The function is applicable to FSDAX device only. The device
page backend driver provides .recovery_write function if the
device has underlying mechanism to clear the uncorrectable
errors on the fly.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/dax/super.c             | 23 +++++++++++++++++++++--
 drivers/md/dm-linear.c          |  4 ++--
 drivers/md/dm-log-writes.c      |  5 +++--
 drivers/md/dm-stripe.c          |  4 ++--
 drivers/md/dm-target.c          |  2 +-
 drivers/md/dm-writecache.c      |  5 +++--
 drivers/md/dm.c                 |  5 +++--
 drivers/nvdimm/pmem.c           | 27 +++++++++++++++++++++++----
 drivers/nvdimm/pmem.h           |  2 +-
 drivers/s390/block/dcssblk.c    |  4 ++--
 fs/dax.c                        | 32 ++++++++++++++++++++++++++------
 fs/fuse/dax.c                   |  4 ++--
 include/linux/dax.h             | 12 +++++++++---
 include/linux/device-mapper.h   |  2 +-
 include/linux/memremap.h        |  7 +++++++
 tools/testing/nvdimm/pmem-dax.c |  2 +-
 16 files changed, 107 insertions(+), 33 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index e3029389d809..c0f0c8f980b1 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -28,6 +28,7 @@ struct dax_device {
 	void *private;
 	unsigned long flags;
 	const struct dax_operations *ops;
+	struct dev_pagemap *pgmap;
 };
 
 static dev_t dax_devt;
@@ -116,6 +117,7 @@ enum dax_device_flags {
  * @dax_dev: a dax_device instance representing the logical memory range
  * @pgoff: offset in pages from the start of the device to translate
  * @nr_pages: number of consecutive pages caller can handle relative to @pfn
+ * @flags: by default 0, set to DAX_RECOVERY to kick start dax recovery
  * @kaddr: output parameter that returns a virtual address mapping of pfn
  * @pfn: output parameter that returns an absolute pfn translation of @pgoff
  *
@@ -123,7 +125,7 @@ enum dax_device_flags {
  * pages accessible at the device relative @pgoff.
  */
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		void **kaddr, pfn_t *pfn)
+		int flags, void **kaddr, pfn_t *pfn)
 {
 	long avail;
 
@@ -136,7 +138,7 @@ long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
 	if (nr_pages < 0)
 		return -EINVAL;
 
-	avail = dax_dev->ops->direct_access(dax_dev, pgoff, nr_pages,
+	avail = dax_dev->ops->direct_access(dax_dev, pgoff, nr_pages, flags,
 			kaddr, pfn);
 	if (!avail)
 		return -ERANGE;
@@ -193,6 +195,18 @@ int dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
 }
 EXPORT_SYMBOL_GPL(dax_zero_page_range);
 
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
+		void *addr, size_t bytes, struct iov_iter *iter)
+{
+	struct dev_pagemap *pgmap = dax_dev->pgmap;
+
+	if (!pgmap || !pgmap->ops->recovery_write)
+		return -EIO;
+	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
+				(void *)iter);
+}
+EXPORT_SYMBOL_GPL(dax_recovery_write);
+
 #ifdef CONFIG_ARCH_HAS_PMEM_API
 void arch_wb_cache_pmem(void *addr, size_t size);
 void dax_flush(struct dax_device *dax_dev, void *addr, size_t size)
@@ -247,6 +261,11 @@ void set_dax_nomc(struct dax_device *dax_dev)
 	set_bit(DAXDEV_NOMC, &dax_dev->flags);
 }
 EXPORT_SYMBOL_GPL(set_dax_nomc);
+void set_dax_pgmap(struct dax_device *dax_dev, struct dev_pagemap *pgmap)
+{
+	dax_dev->pgmap = pgmap;
+}
+EXPORT_SYMBOL_GPL(set_dax_pgmap);
 
 bool dax_alive(struct dax_device *dax_dev)
 {
diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 1b97a11d7151..bfd8895317c0 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -173,11 +173,11 @@ static struct dax_device *linear_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = linear_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int linear_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
index 139b09b06eda..8ee8a9f5b161 100644
--- a/drivers/md/dm-log-writes.c
+++ b/drivers/md/dm-log-writes.c
@@ -912,11 +912,12 @@ static struct dax_device *log_writes_dax_pgoff(struct dm_target *ti,
 }
 
 static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-					 long nr_pages, void **kaddr, pfn_t *pfn)
+					 long nr_pages, int flags,
+					 void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = log_writes_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int log_writes_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index e566115ec0bb..af043d93ef53 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -317,11 +317,11 @@ static struct dax_device *stripe_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
 }
 
 static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dax_device *dax_dev = stripe_dax_pgoff(ti, &pgoff);
 
-	return dax_direct_access(dax_dev, pgoff, nr_pages, kaddr, pfn);
+	return dax_direct_access(dax_dev, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static int stripe_dax_zero_page_range(struct dm_target *ti, pgoff_t pgoff,
diff --git a/drivers/md/dm-target.c b/drivers/md/dm-target.c
index 64dd0b34fcf4..24b1e5628f3a 100644
--- a/drivers/md/dm-target.c
+++ b/drivers/md/dm-target.c
@@ -142,7 +142,7 @@ static void io_err_release_clone_rq(struct request *clone,
 }
 
 static long io_err_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	return -EIO;
 }
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index 4f31591d2d25..58af379107ec 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -286,7 +286,8 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 
 	id = dax_read_lock();
 
-	da = dax_direct_access(wc->ssd_dev->dax_dev, offset, p, &wc->memory_map, &pfn);
+	da = dax_direct_access(wc->ssd_dev->dax_dev, offset, p, 0,
+			&wc->memory_map, &pfn);
 	if (da < 0) {
 		wc->memory_map = NULL;
 		r = da;
@@ -309,7 +310,7 @@ static int persistent_memory_claim(struct dm_writecache *wc)
 		do {
 			long daa;
 			daa = dax_direct_access(wc->ssd_dev->dax_dev, offset + i, p - i,
-						NULL, &pfn);
+						0, NULL, &pfn);
 			if (daa <= 0) {
 				r = daa ? daa : -EINVAL;
 				goto err3;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 997ace47bbd5..6b6509bfaa1a 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1001,7 +1001,8 @@ static struct dm_target *dm_dax_get_live_target(struct mapped_device *md,
 }
 
 static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-				 long nr_pages, void **kaddr, pfn_t *pfn)
+				 long nr_pages, int flags, void **kaddr,
+				 pfn_t *pfn)
 {
 	struct mapped_device *md = dax_get_private(dax_dev);
 	sector_t sector = pgoff * PAGE_SECTORS;
@@ -1019,7 +1020,7 @@ static long dm_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
 	if (len < 1)
 		goto out;
 	nr_pages = min(len, nr_pages);
-	ret = ti->type->direct_access(ti, pgoff, nr_pages, kaddr, pfn);
+	ret = ti->type->direct_access(ti, pgoff, nr_pages, flags, kaddr, pfn);
 
  out:
 	dm_put_live_table(md, srcu_idx);
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 30c71a68175b..7cdaa279beca 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -238,11 +238,11 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 
 /* see "strong" declaration in tools/testing/nvdimm/pmem-dax.c */
 __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 
-	if (unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
+	if (!flags && unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
 					PFN_PHYS(nr_pages))))
 		return -EIO;
 
@@ -277,11 +277,12 @@ static int pmem_dax_zero_page_range(struct dax_device *dax_dev, pgoff_t pgoff,
 }
 
 static long pmem_dax_direct_access(struct dax_device *dax_dev,
-		pgoff_t pgoff, long nr_pages, void **kaddr, pfn_t *pfn)
+		pgoff_t pgoff, long nr_pages, int flags, void **kaddr,
+		pfn_t *pfn)
 {
 	struct pmem_device *pmem = dax_get_private(dax_dev);
 
-	return __pmem_direct_access(pmem, pgoff, nr_pages, kaddr, pfn);
+	return __pmem_direct_access(pmem, pgoff, nr_pages, flags, kaddr, pfn);
 }
 
 static const struct dax_operations pmem_dax_ops = {
@@ -289,6 +290,17 @@ static const struct dax_operations pmem_dax_ops = {
 	.zero_page_range = pmem_dax_zero_page_range,
 };
 
+static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
+		void *addr, size_t bytes, void *iter)
+{
+	struct pmem_device *pmem = pgmap->owner;
+
+	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
+
+	/* XXX more later */
+	return 0;
+}
+
 static ssize_t write_cache_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
@@ -349,6 +361,10 @@ static void pmem_release_disk(void *__pmem)
 	blk_cleanup_disk(pmem->disk);
 }
 
+static const struct dev_pagemap_ops pmem_pgmap_ops = {
+	.recovery_write = pmem_recovery_write,
+};
+
 static int pmem_attach_disk(struct device *dev,
 		struct nd_namespace_common *ndns)
 {
@@ -380,6 +396,8 @@ static int pmem_attach_disk(struct device *dev,
 		rc = nvdimm_setup_pfn(nd_pfn, &pmem->pgmap);
 		if (rc)
 			return rc;
+		if (nd_pfn->mode == PFN_MODE_PMEM)
+			pmem->pgmap.ops = &pmem_pgmap_ops;
 	}
 
 	/* we're attaching a block device, disable raw namespace access */
@@ -464,6 +482,7 @@ static int pmem_attach_disk(struct device *dev,
 	}
 	set_dax_nocache(dax_dev);
 	set_dax_nomc(dax_dev);
+	set_dax_pgmap(dax_dev, &pmem->pgmap);
 	if (is_nvdimm_sync(nd_region))
 		set_dax_synchronous(dax_dev);
 	rc = dax_add_host(dax_dev, disk);
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 59cfe13ea8a8..823eeffa7298 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -27,7 +27,7 @@ struct pmem_device {
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flag, void **kaddr, pfn_t *pfn);
 
 #ifdef CONFIG_MEMORY_FAILURE
 static inline bool test_and_clear_pmem_poison(struct page *page)
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index d614843caf6c..c3fbf500868f 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -32,7 +32,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_submit_bio(struct bio *bio);
 static long dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 
 static char dcssblk_segments[DCSSBLK_PARM_LEN] = "\0";
 
@@ -927,7 +927,7 @@ __dcssblk_direct_access(struct dcssblk_dev_info *dev_info, pgoff_t pgoff,
 
 static long
 dcssblk_dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	struct dcssblk_dev_info *dev_info = dax_get_private(dax_dev);
 
diff --git a/fs/dax.c b/fs/dax.c
index cd03485867a7..9f7efd0a1dd9 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -722,7 +722,7 @@ static int copy_cow_page_dax(struct vm_fault *vmf, const struct iomap_iter *iter
 	int id;
 
 	id = dax_read_lock();
-	rc = dax_direct_access(iter->iomap.dax_dev, pgoff, 1, &kaddr, NULL);
+	rc = dax_direct_access(iter->iomap.dax_dev, pgoff, 1, 0, &kaddr, NULL);
 	if (rc < 0) {
 		dax_read_unlock(id);
 		return rc;
@@ -1013,7 +1013,7 @@ static int dax_iomap_pfn(const struct iomap *iomap, loff_t pos, size_t size,
 	long length;
 
 	id = dax_read_lock();
-	length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size),
+	length = dax_direct_access(iomap->dax_dev, pgoff, PHYS_PFN(size), 0,
 				   NULL, pfnp);
 	if (length < 0) {
 		rc = length;
@@ -1123,7 +1123,7 @@ static int dax_memzero(struct dax_device *dax_dev, pgoff_t pgoff,
 	void *kaddr;
 	long ret;
 
-	ret = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
+	ret = dax_direct_access(dax_dev, pgoff, 1, 0, &kaddr, NULL);
 	if (ret > 0) {
 		memset(kaddr + offset, 0, size);
 		dax_flush(dax_dev, kaddr + offset, size);
@@ -1240,15 +1240,27 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		const size_t size = ALIGN(length + offset, PAGE_SIZE);
 		pgoff_t pgoff = dax_iomap_pgoff(iomap, pos);
 		ssize_t map_len;
+		int flags, recov;
 		void *kaddr;
+		long nrpg;
 
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
 			break;
 		}
 
-		map_len = dax_direct_access(dax_dev, pgoff, PHYS_PFN(size),
-				&kaddr, NULL);
+		recov = 0;
+		flags = 0;
+		nrpg = PHYS_PFN(size);
+		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
+					&kaddr, NULL);
+		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
+			flags |= DAX_RECOVERY;
+			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
+						flags, &kaddr, NULL);
+			if (map_len > 0)
+				recov++;
+		}
 		if (map_len < 0) {
 			ret = map_len;
 			break;
@@ -1260,7 +1272,10 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		if (map_len > end - pos)
 			map_len = end - pos;
 
-		if (iov_iter_rw(iter) == WRITE)
+		if (recov)
+			xfer = dax_recovery_write(dax_dev, pgoff, kaddr,
+					map_len, iter);
+		else if (iov_iter_rw(iter) == WRITE)
 			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
 					map_len, iter);
 		else
@@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
 		length -= xfer;
 		done += xfer;
 
+		if (recov && (xfer == (ssize_t) -EIO)) {
+			pr_warn("dax_recovery_write failed\n");
+			ret = -EIO;
+			break;
+		}
 		if (xfer == 0)
 			ret = -EFAULT;
 		if (xfer < map_len)
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 182b24a14804..e22c3ca9fdce 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -1241,8 +1241,8 @@ static int fuse_dax_mem_range_init(struct fuse_conn_dax *fcd)
 	INIT_DELAYED_WORK(&fcd->free_work, fuse_dax_free_mem_worker);
 
 	id = dax_read_lock();
-	nr_pages = dax_direct_access(fcd->dev, 0, PHYS_PFN(dax_size), NULL,
-				     NULL);
+	nr_pages = dax_direct_access(fcd->dev, 0, PHYS_PFN(dax_size), 0,
+				NULL, NULL);
 	dax_read_unlock(id);
 	if (nr_pages < 0) {
 		pr_debug("dax_direct_access() returned %ld\n", nr_pages);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9fc5f99a0ae2..50bf4b0fb9b6 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -14,14 +14,17 @@ struct iomap_ops;
 struct iomap_iter;
 struct iomap;
 
+/* Flag to communicate for DAX recovery operation */
+#define	DAX_RECOVERY	0x1
+
 struct dax_operations {
 	/*
 	 * direct_access: translate a device-relative
 	 * logical-page-offset into an absolute physical pfn. Return the
 	 * number of pages available for DAX at that pfn.
 	 */
-	long (*direct_access)(struct dax_device *, pgoff_t, long,
-			void **, pfn_t *);
+	long (*direct_access)(struct dax_device *dax_dev, pgoff_t pgoff,
+			long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 	/*
 	 * Validate whether this device is usable as an fsdax backing
 	 * device.
@@ -40,6 +43,8 @@ void dax_write_cache(struct dax_device *dax_dev, bool wc);
 bool dax_write_cache_enabled(struct dax_device *dax_dev);
 bool dax_synchronous(struct dax_device *dax_dev);
 void set_dax_synchronous(struct dax_device *dax_dev);
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
+		size_t bytes, struct iov_iter *i);
 /*
  * Check if given mapping is supported by the file / underlying device.
  */
@@ -91,6 +96,7 @@ static inline bool daxdev_mapping_supported(struct vm_area_struct *vma,
 
 void set_dax_nocache(struct dax_device *dax_dev);
 void set_dax_nomc(struct dax_device *dax_dev);
+void set_dax_pgmap(struct dax_device *dax_dev, struct dev_pagemap *pgmap);
 
 struct writeback_control;
 #if defined(CONFIG_BLOCK) && defined(CONFIG_FS_DAX)
@@ -178,7 +184,7 @@ static inline void dax_read_unlock(int id)
 bool dax_alive(struct dax_device *dax_dev);
 void *dax_get_private(struct dax_device *dax_dev);
 long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
-		void **kaddr, pfn_t *pfn);
+		int flags, void **kaddr, pfn_t *pfn);
 size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
 		size_t bytes, struct iov_iter *i);
 size_t dax_copy_to_iter(struct dax_device *dax_dev, pgoff_t pgoff, void *addr,
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index b26fecf6c8e8..8cdfd7566a38 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -146,7 +146,7 @@ typedef int (*dm_busy_fn) (struct dm_target *ti);
  * >= 0 : the number of bytes accessible at the address
  */
 typedef long (*dm_dax_direct_access_fn) (struct dm_target *ti, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn);
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn);
 typedef int (*dm_dax_zero_page_range_fn)(struct dm_target *ti, pgoff_t pgoff,
 		size_t nr_pages);
 
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 1fafcc38acba..a495e3c4c5fc 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -77,6 +77,13 @@ struct dev_pagemap_ops {
 	 * the page back to a CPU accessible page.
 	 */
 	vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
+
+	/*
+	 * Used for FS DAX device only. For synchronous recovery from DAX media
+	 * encountering Uncorrectable Error.
+	 */
+	size_t (*recovery_write)(struct dev_pagemap *pgmap, pgoff_t pgoff,
+			void *addr, size_t bytes, void *iter);
 };
 
 #define PGMAP_ALTMAP_VALID	(1 << 0)
diff --git a/tools/testing/nvdimm/pmem-dax.c b/tools/testing/nvdimm/pmem-dax.c
index af19c85558e7..287db5e3e293 100644
--- a/tools/testing/nvdimm/pmem-dax.c
+++ b/tools/testing/nvdimm/pmem-dax.c
@@ -8,7 +8,7 @@
 #include <nd.h>
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
 {
 	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;
 
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Refactor the pmem_clear_poison() in order to share common code
later.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/nvdimm/pmem.c | 81 ++++++++++++++++++++++++++++++-------------
 1 file changed, 56 insertions(+), 25 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 7cdaa279beca..18f357fbef69 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -45,10 +45,27 @@ static struct nd_region *to_region(struct pmem_device *pmem)
 	return to_nd_region(to_dev(pmem)->parent);
 }
 
-static void hwpoison_clear(struct pmem_device *pmem,
-		phys_addr_t phys, unsigned int len)
+static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
 {
+	return pmem->phys_addr + offset;
+}
+
+static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
+{
+	return (offset - pmem->data_offset) / 512;
+}
+
+static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
+{
+	return sector * 512 + pmem->data_offset;
+}
+
+static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
+		unsigned int len)
+{
+	phys_addr_t phys = to_phys(pmem, offset);
 	unsigned long pfn_start, pfn_end, pfn;
+	unsigned int blks = len / 512;
 
 	/* only pmem in the linear map supports HWPoison */
 	if (is_vmalloc_addr(pmem->virt_addr))
@@ -67,33 +84,47 @@ static void hwpoison_clear(struct pmem_device *pmem,
 		if (test_and_clear_pmem_poison(page))
 			clear_mce_nospec(pfn);
 	}
+
+	dev_dbg(to_dev(pmem), "%#llx clear %u sector%s\n",
+		(unsigned long long) to_sect(pmem, offset), blks,
+		blks > 1 ? "s" : "");
 }
 
-static blk_status_t pmem_clear_poison(struct pmem_device *pmem,
-		phys_addr_t offset, unsigned int len)
+static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)
 {
-	struct device *dev = to_dev(pmem);
-	sector_t sector;
+	badblocks_clear(&pmem->bb, sector, blks);
+	if (pmem->bb_state)
+		sysfs_notify_dirent(pmem->bb_state);
+}
+
+static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
+		phys_addr_t offset, unsigned int len,
+		unsigned int *blks)
+{
+	phys_addr_t phys = to_phys(pmem, offset);
 	long cleared;
-	blk_status_t rc = BLK_STS_OK;
-
-	sector = (offset - pmem->data_offset) / 512;
-
-	cleared = nvdimm_clear_poison(dev, pmem->phys_addr + offset, len);
-	if (cleared < len)
-		rc = BLK_STS_IOERR;
-	if (cleared > 0 && cleared / 512) {
-		hwpoison_clear(pmem, pmem->phys_addr + offset, cleared);
-		cleared /= 512;
-		dev_dbg(dev, "%#llx clear %ld sector%s\n",
-				(unsigned long long) sector, cleared,
-				cleared > 1 ? "s" : "");
-		badblocks_clear(&pmem->bb, sector, cleared);
-		if (pmem->bb_state)
-			sysfs_notify_dirent(pmem->bb_state);
-	}
+	blk_status_t rc;
 
+	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
+	*blks = cleared / 512;
+	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
+	if (cleared <= 0 || *blks == 0)
+		return rc;
+
+	clear_hwpoison(pmem, offset, cleared);
 	arch_invalidate_pmem(pmem->virt_addr + offset, len);
+	return rc;
+}
+
+static blk_status_t pmem_clear_poison(struct pmem_device *pmem,
+		phys_addr_t offset, unsigned int len)
+{
+	unsigned int blks;
+	blk_status_t rc;
+
+	rc = __pmem_clear_poison(pmem, offset, len, &blks);
+	if (blks > 0)
+		clear_bb(pmem, to_sect(pmem, offset), blks);
 
 	return rc;
 }
@@ -143,7 +174,7 @@ static blk_status_t pmem_do_read(struct pmem_device *pmem,
 			sector_t sector, unsigned int len)
 {
 	blk_status_t rc;
-	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+	phys_addr_t pmem_off = to_offset(pmem, sector);
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
@@ -158,7 +189,7 @@ static blk_status_t pmem_do_write(struct pmem_device *pmem,
 			struct page *page, unsigned int page_off,
 			sector_t sector, unsigned int len)
 {
-	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+	phys_addr_t pmem_off = to_offset(pmem, sector);
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

Refactor the pmem_clear_poison() in order to share common code
later.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/nvdimm/pmem.c | 81 ++++++++++++++++++++++++++++++-------------
 1 file changed, 56 insertions(+), 25 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 7cdaa279beca..18f357fbef69 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -45,10 +45,27 @@ static struct nd_region *to_region(struct pmem_device *pmem)
 	return to_nd_region(to_dev(pmem)->parent);
 }
 
-static void hwpoison_clear(struct pmem_device *pmem,
-		phys_addr_t phys, unsigned int len)
+static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
 {
+	return pmem->phys_addr + offset;
+}
+
+static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
+{
+	return (offset - pmem->data_offset) / 512;
+}
+
+static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
+{
+	return sector * 512 + pmem->data_offset;
+}
+
+static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
+		unsigned int len)
+{
+	phys_addr_t phys = to_phys(pmem, offset);
 	unsigned long pfn_start, pfn_end, pfn;
+	unsigned int blks = len / 512;
 
 	/* only pmem in the linear map supports HWPoison */
 	if (is_vmalloc_addr(pmem->virt_addr))
@@ -67,33 +84,47 @@ static void hwpoison_clear(struct pmem_device *pmem,
 		if (test_and_clear_pmem_poison(page))
 			clear_mce_nospec(pfn);
 	}
+
+	dev_dbg(to_dev(pmem), "%#llx clear %u sector%s\n",
+		(unsigned long long) to_sect(pmem, offset), blks,
+		blks > 1 ? "s" : "");
 }
 
-static blk_status_t pmem_clear_poison(struct pmem_device *pmem,
-		phys_addr_t offset, unsigned int len)
+static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)
 {
-	struct device *dev = to_dev(pmem);
-	sector_t sector;
+	badblocks_clear(&pmem->bb, sector, blks);
+	if (pmem->bb_state)
+		sysfs_notify_dirent(pmem->bb_state);
+}
+
+static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
+		phys_addr_t offset, unsigned int len,
+		unsigned int *blks)
+{
+	phys_addr_t phys = to_phys(pmem, offset);
 	long cleared;
-	blk_status_t rc = BLK_STS_OK;
-
-	sector = (offset - pmem->data_offset) / 512;
-
-	cleared = nvdimm_clear_poison(dev, pmem->phys_addr + offset, len);
-	if (cleared < len)
-		rc = BLK_STS_IOERR;
-	if (cleared > 0 && cleared / 512) {
-		hwpoison_clear(pmem, pmem->phys_addr + offset, cleared);
-		cleared /= 512;
-		dev_dbg(dev, "%#llx clear %ld sector%s\n",
-				(unsigned long long) sector, cleared,
-				cleared > 1 ? "s" : "");
-		badblocks_clear(&pmem->bb, sector, cleared);
-		if (pmem->bb_state)
-			sysfs_notify_dirent(pmem->bb_state);
-	}
+	blk_status_t rc;
 
+	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
+	*blks = cleared / 512;
+	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
+	if (cleared <= 0 || *blks == 0)
+		return rc;
+
+	clear_hwpoison(pmem, offset, cleared);
 	arch_invalidate_pmem(pmem->virt_addr + offset, len);
+	return rc;
+}
+
+static blk_status_t pmem_clear_poison(struct pmem_device *pmem,
+		phys_addr_t offset, unsigned int len)
+{
+	unsigned int blks;
+	blk_status_t rc;
+
+	rc = __pmem_clear_poison(pmem, offset, len, &blks);
+	if (blks > 0)
+		clear_bb(pmem, to_sect(pmem, offset), blks);
 
 	return rc;
 }
@@ -143,7 +174,7 @@ static blk_status_t pmem_do_read(struct pmem_device *pmem,
 			sector_t sector, unsigned int len)
 {
 	blk_status_t rc;
-	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+	phys_addr_t pmem_off = to_offset(pmem, sector);
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len)))
@@ -158,7 +189,7 @@ static blk_status_t pmem_do_write(struct pmem_device *pmem,
 			struct page *page, unsigned int page_off,
 			sector_t sector, unsigned int len)
 {
-	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
+	phys_addr_t pmem_off = to_offset(pmem, sector);
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
 	if (unlikely(is_bad_pmem(&pmem->bb, sector, len))) {
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH v6 6/6] pmem: implement pmem_recovery_write()
  2022-03-19  6:28 ` [dm-devel] " Jane Chu
@ 2022-03-19  6:28   ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

The recovery write thread started out as a normal pwrite thread and
when the filesystem was told about potential media error in the
range, filesystem turns the normal pwrite to a dax_recovery_write.

The recovery write consists of clearing media poison, clearing page
HWPoison bit, reenable page-wide read-write permission, flush the
caches and finally write.  A competing pread thread will be held
off during the recovery process since data read back might not be
valid, and this is achieved by clearing the badblock records after
the recovery write is complete. Competing recovery write threads
are serialized by pmem device level .recovery_lock.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/nvdimm/pmem.c | 49 ++++++++++++++++++++++++++++++++++++++++---
 drivers/nvdimm/pmem.h |  1 +
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 18f357fbef69..c4fbd9426075 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -321,15 +321,57 @@ static const struct dax_operations pmem_dax_ops = {
 	.zero_page_range = pmem_dax_zero_page_range,
 };
 
+/*
+ * The recovery write thread started out as a normal pwrite thread and
+ * when the filesystem was told about potential media error in the
+ * range, filesystem turns the normal pwrite to a dax_recovery_write.
+ *
+ * The recovery write consists of clearing media poison, clearing page
+ * HWPoison bit, reenable page-wide read-write permission, flush the
+ * caches and finally write.  A competing pread thread will be held
+ * off during the recovery process since data read back might not be
+ * valid, and this is achieved by clearing the badblock records after
+ * the recovery write is complete. Competing recovery write threads
+ * are serialized by pmem device level .recovery_lock.
+ */
 static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
 		void *addr, size_t bytes, void *iter)
 {
 	struct pmem_device *pmem = pgmap->owner;
+	size_t olen, len, off;
+	phys_addr_t pmem_off;
+	struct device *dev = pmem->bb.dev;
+	unsigned int blks;
 
-	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
+	off = (unsigned long)addr & ~PAGE_MASK;
+	len = PFN_PHYS(PFN_UP(off + bytes));
+	if (!is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512, len))
+		return _copy_from_iter_flushcache(addr, bytes, iter);
 
-	/* XXX more later */
-	return 0;
+	/*
+	 * Not page-aligned range cannot be recovered. This should not
+	 * happen unless something else went wrong.
+	 */
+	if (off || !(PAGE_ALIGNED(bytes))) {
+		dev_warn(dev, "Found poison, but addr(%p) or bytes(%#lx) not page aligned\n",
+			addr, bytes);
+		return (size_t) -EIO;
+	}
+
+	mutex_lock(&pmem->recovery_lock);
+	pmem_off = PFN_PHYS(pgoff) + pmem->data_offset;
+	if (__pmem_clear_poison(pmem, pmem_off, len, &blks) != BLK_STS_OK) {
+		dev_warn(dev, "pmem dimm poison clearing failed\n");
+		mutex_unlock(&pmem->recovery_lock);
+		return (size_t) -EIO;
+	}
+
+	olen = _copy_from_iter_flushcache(addr, bytes, iter);
+	if (blks > 0)
+		clear_bb(pmem, to_sect(pmem, pmem_off), blks);
+
+	mutex_unlock(&pmem->recovery_lock);
+	return olen;
 }
 
 static ssize_t write_cache_show(struct device *dev,
@@ -520,6 +562,7 @@ static int pmem_attach_disk(struct device *dev,
 	if (rc)
 		goto out_cleanup_dax;
 	dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
+	mutex_init(&pmem->recovery_lock);
 	pmem->dax_dev = dax_dev;
 
 	rc = device_add_disk(dev, disk, pmem_attribute_groups);
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 823eeffa7298..bad7d1b05691 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -24,6 +24,7 @@ struct pmem_device {
 	struct dax_device	*dax_dev;
 	struct gendisk		*disk;
 	struct dev_pagemap	pgmap;
+	struct mutex		recovery_lock;
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [dm-devel] [PATCH v6 6/6] pmem: implement pmem_recovery_write()
@ 2022-03-19  6:28   ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-19  6:28 UTC (permalink / raw)
  To: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

The recovery write thread started out as a normal pwrite thread and
when the filesystem was told about potential media error in the
range, filesystem turns the normal pwrite to a dax_recovery_write.

The recovery write consists of clearing media poison, clearing page
HWPoison bit, reenable page-wide read-write permission, flush the
caches and finally write.  A competing pread thread will be held
off during the recovery process since data read back might not be
valid, and this is achieved by clearing the badblock records after
the recovery write is complete. Competing recovery write threads
are serialized by pmem device level .recovery_lock.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 drivers/nvdimm/pmem.c | 49 ++++++++++++++++++++++++++++++++++++++++---
 drivers/nvdimm/pmem.h |  1 +
 2 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 18f357fbef69..c4fbd9426075 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -321,15 +321,57 @@ static const struct dax_operations pmem_dax_ops = {
 	.zero_page_range = pmem_dax_zero_page_range,
 };
 
+/*
+ * The recovery write thread started out as a normal pwrite thread and
+ * when the filesystem was told about potential media error in the
+ * range, filesystem turns the normal pwrite to a dax_recovery_write.
+ *
+ * The recovery write consists of clearing media poison, clearing page
+ * HWPoison bit, reenable page-wide read-write permission, flush the
+ * caches and finally write.  A competing pread thread will be held
+ * off during the recovery process since data read back might not be
+ * valid, and this is achieved by clearing the badblock records after
+ * the recovery write is complete. Competing recovery write threads
+ * are serialized by pmem device level .recovery_lock.
+ */
 static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
 		void *addr, size_t bytes, void *iter)
 {
 	struct pmem_device *pmem = pgmap->owner;
+	size_t olen, len, off;
+	phys_addr_t pmem_off;
+	struct device *dev = pmem->bb.dev;
+	unsigned int blks;
 
-	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
+	off = (unsigned long)addr & ~PAGE_MASK;
+	len = PFN_PHYS(PFN_UP(off + bytes));
+	if (!is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512, len))
+		return _copy_from_iter_flushcache(addr, bytes, iter);
 
-	/* XXX more later */
-	return 0;
+	/*
+	 * Not page-aligned range cannot be recovered. This should not
+	 * happen unless something else went wrong.
+	 */
+	if (off || !(PAGE_ALIGNED(bytes))) {
+		dev_warn(dev, "Found poison, but addr(%p) or bytes(%#lx) not page aligned\n",
+			addr, bytes);
+		return (size_t) -EIO;
+	}
+
+	mutex_lock(&pmem->recovery_lock);
+	pmem_off = PFN_PHYS(pgoff) + pmem->data_offset;
+	if (__pmem_clear_poison(pmem, pmem_off, len, &blks) != BLK_STS_OK) {
+		dev_warn(dev, "pmem dimm poison clearing failed\n");
+		mutex_unlock(&pmem->recovery_lock);
+		return (size_t) -EIO;
+	}
+
+	olen = _copy_from_iter_flushcache(addr, bytes, iter);
+	if (blks > 0)
+		clear_bb(pmem, to_sect(pmem, pmem_off), blks);
+
+	mutex_unlock(&pmem->recovery_lock);
+	return olen;
 }
 
 static ssize_t write_cache_show(struct device *dev,
@@ -520,6 +562,7 @@ static int pmem_attach_disk(struct device *dev,
 	if (rc)
 		goto out_cleanup_dax;
 	dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
+	mutex_init(&pmem->recovery_lock);
 	pmem->dax_dev = dax_dev;
 
 	rc = device_add_disk(dev, disk, pmem_attribute_groups);
diff --git a/drivers/nvdimm/pmem.h b/drivers/nvdimm/pmem.h
index 823eeffa7298..bad7d1b05691 100644
--- a/drivers/nvdimm/pmem.h
+++ b/drivers/nvdimm/pmem.h
@@ -24,6 +24,7 @@ struct pmem_device {
 	struct dax_device	*dax_dev;
 	struct gendisk		*disk;
 	struct dev_pagemap	pgmap;
+	struct mutex		recovery_lock;
 };
 
 long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-- 
2.18.4

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-19  8:13     ` kernel test robot
  -1 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:13 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-randconfig-a015 (https://download.01.org/0day-ci/archive/20220319/202203191637.PK2oJUeq-lkp@intel.com/config)
compiler: gcc-9 (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/71b9b09529b207ce15667c1f5fba4b727b6754e6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 71b9b09529b207ce15667c1f5fba4b727b6754e6
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/mm/pat/ fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/mm/pat/set_memory.c:1935:5: warning: no previous prototype for 'set_mce_nospec' [-Wmissing-prototypes]
    1935 | int set_mce_nospec(unsigned long pfn, bool unmap)
         |     ^~~~~~~~~~~~~~
>> arch/x86/mm/pat/set_memory.c:1968:5: warning: no previous prototype for 'clear_mce_nospec' [-Wmissing-prototypes]
    1968 | int clear_mce_nospec(unsigned long pfn)
         |     ^~~~~~~~~~~~~~~~


vim +/set_mce_nospec +1935 arch/x86/mm/pat/set_memory.c

  1927	
  1928	#ifdef CONFIG_X86_64
  1929	/*
  1930	 * Prevent speculative access to the page by either unmapping
  1931	 * it (if we do not require access to any part of the page) or
  1932	 * marking it uncacheable (if we want to try to retrieve data
  1933	 * from non-poisoned lines in the page).
  1934	 */
> 1935	int set_mce_nospec(unsigned long pfn, bool unmap)
  1936	{
  1937		unsigned long decoy_addr;
  1938		int rc;
  1939	
  1940		/* SGX pages are not in the 1:1 map */
  1941		if (arch_is_platform_page(pfn << PAGE_SHIFT))
  1942			return 0;
  1943		/*
  1944		 * We would like to just call:
  1945		 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
  1946		 * but doing that would radically increase the odds of a
  1947		 * speculative access to the poison page because we'd have
  1948		 * the virtual address of the kernel 1:1 mapping sitting
  1949		 * around in registers.
  1950		 * Instead we get tricky.  We create a non-canonical address
  1951		 * that looks just like the one we want, but has bit 63 flipped.
  1952		 * This relies on set_memory_XX() properly sanitizing any __pa()
  1953		 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
  1954		 */
  1955		decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
  1956	
  1957		if (unmap)
  1958			rc = set_memory_np(decoy_addr, 1);
  1959		else
  1960			rc = set_memory_uc(decoy_addr, 1);
  1961		if (rc)
  1962			pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
  1963		return rc;
  1964	}
  1965	EXPORT_SYMBOL(set_mce_nospec);
  1966	
  1967	/* Restore full speculative operation to the pfn. */
> 1968	int clear_mce_nospec(unsigned long pfn)
  1969	{
  1970		return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
  1971	}
  1972	EXPORT_SYMBOL(clear_mce_nospec);
  1973	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-19  8:13     ` kernel test robot
  0 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:13 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-randconfig-a015 (https://download.01.org/0day-ci/archive/20220319/202203191637.PK2oJUeq-lkp@intel.com/config)
compiler: gcc-9 (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/71b9b09529b207ce15667c1f5fba4b727b6754e6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 71b9b09529b207ce15667c1f5fba4b727b6754e6
        # save the config file to linux build tree
        mkdir build_dir
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/mm/pat/ fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/mm/pat/set_memory.c:1935:5: warning: no previous prototype for 'set_mce_nospec' [-Wmissing-prototypes]
    1935 | int set_mce_nospec(unsigned long pfn, bool unmap)
         |     ^~~~~~~~~~~~~~
>> arch/x86/mm/pat/set_memory.c:1968:5: warning: no previous prototype for 'clear_mce_nospec' [-Wmissing-prototypes]
    1968 | int clear_mce_nospec(unsigned long pfn)
         |     ^~~~~~~~~~~~~~~~


vim +/set_mce_nospec +1935 arch/x86/mm/pat/set_memory.c

  1927	
  1928	#ifdef CONFIG_X86_64
  1929	/*
  1930	 * Prevent speculative access to the page by either unmapping
  1931	 * it (if we do not require access to any part of the page) or
  1932	 * marking it uncacheable (if we want to try to retrieve data
  1933	 * from non-poisoned lines in the page).
  1934	 */
> 1935	int set_mce_nospec(unsigned long pfn, bool unmap)
  1936	{
  1937		unsigned long decoy_addr;
  1938		int rc;
  1939	
  1940		/* SGX pages are not in the 1:1 map */
  1941		if (arch_is_platform_page(pfn << PAGE_SHIFT))
  1942			return 0;
  1943		/*
  1944		 * We would like to just call:
  1945		 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
  1946		 * but doing that would radically increase the odds of a
  1947		 * speculative access to the poison page because we'd have
  1948		 * the virtual address of the kernel 1:1 mapping sitting
  1949		 * around in registers.
  1950		 * Instead we get tricky.  We create a non-canonical address
  1951		 * that looks just like the one we want, but has bit 63 flipped.
  1952		 * This relies on set_memory_XX() properly sanitizing any __pa()
  1953		 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
  1954		 */
  1955		decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
  1956	
  1957		if (unmap)
  1958			rc = set_memory_np(decoy_addr, 1);
  1959		else
  1960			rc = set_memory_uc(decoy_addr, 1);
  1961		if (rc)
  1962			pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
  1963		return rc;
  1964	}
  1965	EXPORT_SYMBOL(set_mce_nospec);
  1966	
  1967	/* Restore full speculative operation to the pfn. */
> 1968	int clear_mce_nospec(unsigned long pfn)
  1969	{
  1970		return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
  1971	}
  1972	EXPORT_SYMBOL(clear_mce_nospec);
  1973	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-19  8:24     ` kernel test robot
  -1 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:24 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: s390-randconfig-r044-20220317 (https://download.01.org/0day-ci/archive/20220319/202203191610.umg0CGkh-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/203570f765a6ad07eb5809850478a25a5257f7e2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 203570f765a6ad07eb5809850478a25a5257f7e2
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=s390 SHELL=/bin/bash fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   fs/fuse/virtio_fs.c: In function 'virtio_fs_zero_page_range':
>> fs/fuse/virtio_fs.c:774:58: warning: passing argument 4 of 'dax_direct_access' makes integer from pointer without a cast [-Wint-conversion]
     774 |         rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
         |                                                          ^~~~~~
         |                                                          |
         |                                                          void **
   In file included from fs/fuse/virtio_fs.c:8:
   include/linux/dax.h:187:21: note: expected 'int' but argument is of type 'void **'
     187 |                 int flags, void **kaddr, pfn_t *pfn);
         |                 ~~~~^~~~~
   fs/fuse/virtio_fs.c:774:14: error: too few arguments to function 'dax_direct_access'
     774 |         rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
         |              ^~~~~~~~~~~~~~~~~
   In file included from fs/fuse/virtio_fs.c:8:
   include/linux/dax.h:186:6: note: declared here
     186 | long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
         |      ^~~~~~~~~~~~~~~~~
   fs/fuse/virtio_fs.c: At top level:
   fs/fuse/virtio_fs.c:783:26: error: initialization of 'long int (*)(struct dax_device *, long unsigned int,  long int,  int,  void **, pfn_t *)' from incompatible pointer type 'long int (*)(struct dax_device *, long unsigned int,  long int,  void **, pfn_t *)' [-Werror=incompatible-pointer-types]
     783 |         .direct_access = virtio_fs_direct_access,
         |                          ^~~~~~~~~~~~~~~~~~~~~~~
   fs/fuse/virtio_fs.c:783:26: note: (near initialization for 'virtio_fs_dax_ops.direct_access')
   cc1: some warnings being treated as errors


vim +/dax_direct_access +774 fs/fuse/virtio_fs.c

22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  767  
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  768  static int virtio_fs_zero_page_range(struct dax_device *dax_dev,
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  769  				     pgoff_t pgoff, size_t nr_pages)
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  770  {
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  771  	long rc;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  772  	void *kaddr;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  773  
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19 @774  	rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  775  	if (rc < 0)
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  776  		return rc;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  777  	memset(kaddr, 0, nr_pages << PAGE_SHIFT);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  778  	dax_flush(dax_dev, kaddr, nr_pages << PAGE_SHIFT);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  779  	return 0;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  780  }
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  781  

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-19  8:24     ` kernel test robot
  0 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:24 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: s390-randconfig-r044-20220317 (https://download.01.org/0day-ci/archive/20220319/202203191610.umg0CGkh-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/203570f765a6ad07eb5809850478a25a5257f7e2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 203570f765a6ad07eb5809850478a25a5257f7e2
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=s390 SHELL=/bin/bash fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   fs/fuse/virtio_fs.c: In function 'virtio_fs_zero_page_range':
>> fs/fuse/virtio_fs.c:774:58: warning: passing argument 4 of 'dax_direct_access' makes integer from pointer without a cast [-Wint-conversion]
     774 |         rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
         |                                                          ^~~~~~
         |                                                          |
         |                                                          void **
   In file included from fs/fuse/virtio_fs.c:8:
   include/linux/dax.h:187:21: note: expected 'int' but argument is of type 'void **'
     187 |                 int flags, void **kaddr, pfn_t *pfn);
         |                 ~~~~^~~~~
   fs/fuse/virtio_fs.c:774:14: error: too few arguments to function 'dax_direct_access'
     774 |         rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
         |              ^~~~~~~~~~~~~~~~~
   In file included from fs/fuse/virtio_fs.c:8:
   include/linux/dax.h:186:6: note: declared here
     186 | long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
         |      ^~~~~~~~~~~~~~~~~
   fs/fuse/virtio_fs.c: At top level:
   fs/fuse/virtio_fs.c:783:26: error: initialization of 'long int (*)(struct dax_device *, long unsigned int,  long int,  int,  void **, pfn_t *)' from incompatible pointer type 'long int (*)(struct dax_device *, long unsigned int,  long int,  void **, pfn_t *)' [-Werror=incompatible-pointer-types]
     783 |         .direct_access = virtio_fs_direct_access,
         |                          ^~~~~~~~~~~~~~~~~~~~~~~
   fs/fuse/virtio_fs.c:783:26: note: (near initialization for 'virtio_fs_dax_ops.direct_access')
   cc1: some warnings being treated as errors


vim +/dax_direct_access +774 fs/fuse/virtio_fs.c

22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  767  
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  768  static int virtio_fs_zero_page_range(struct dax_device *dax_dev,
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  769  				     pgoff_t pgoff, size_t nr_pages)
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  770  {
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  771  	long rc;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  772  	void *kaddr;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  773  
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19 @774  	rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  775  	if (rc < 0)
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  776  		return rc;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  777  	memset(kaddr, 0, nr_pages << PAGE_SHIFT);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  778  	dax_flush(dax_dev, kaddr, nr_pages << PAGE_SHIFT);
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  779  	return 0;
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  780  }
22f3787e9d95e7 Stefan Hajnoczi 2020-08-19  781  

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-19  8:24     ` kernel test robot
  -1 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:24 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: llvm, kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-randconfig-a012 (https://download.01.org/0day-ci/archive/20220319/202203191603.mQUQZZV5-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6e70e4056dff962ec634c5bd4f2f4105a0bef71)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/71b9b09529b207ce15667c1f5fba4b727b6754e6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 71b9b09529b207ce15667c1f5fba4b727b6754e6
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/mm/pat/ fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/mm/pat/set_memory.c:1935:5: warning: no previous prototype for function 'set_mce_nospec' [-Wmissing-prototypes]
   int set_mce_nospec(unsigned long pfn, bool unmap)
       ^
   arch/x86/mm/pat/set_memory.c:1935:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int set_mce_nospec(unsigned long pfn, bool unmap)
   ^
   static 
>> arch/x86/mm/pat/set_memory.c:1968:5: warning: no previous prototype for function 'clear_mce_nospec' [-Wmissing-prototypes]
   int clear_mce_nospec(unsigned long pfn)
       ^
   arch/x86/mm/pat/set_memory.c:1968:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int clear_mce_nospec(unsigned long pfn)
   ^
   static 
   2 warnings generated.


vim +/set_mce_nospec +1935 arch/x86/mm/pat/set_memory.c

  1927	
  1928	#ifdef CONFIG_X86_64
  1929	/*
  1930	 * Prevent speculative access to the page by either unmapping
  1931	 * it (if we do not require access to any part of the page) or
  1932	 * marking it uncacheable (if we want to try to retrieve data
  1933	 * from non-poisoned lines in the page).
  1934	 */
> 1935	int set_mce_nospec(unsigned long pfn, bool unmap)
  1936	{
  1937		unsigned long decoy_addr;
  1938		int rc;
  1939	
  1940		/* SGX pages are not in the 1:1 map */
  1941		if (arch_is_platform_page(pfn << PAGE_SHIFT))
  1942			return 0;
  1943		/*
  1944		 * We would like to just call:
  1945		 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
  1946		 * but doing that would radically increase the odds of a
  1947		 * speculative access to the poison page because we'd have
  1948		 * the virtual address of the kernel 1:1 mapping sitting
  1949		 * around in registers.
  1950		 * Instead we get tricky.  We create a non-canonical address
  1951		 * that looks just like the one we want, but has bit 63 flipped.
  1952		 * This relies on set_memory_XX() properly sanitizing any __pa()
  1953		 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
  1954		 */
  1955		decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
  1956	
  1957		if (unmap)
  1958			rc = set_memory_np(decoy_addr, 1);
  1959		else
  1960			rc = set_memory_uc(decoy_addr, 1);
  1961		if (rc)
  1962			pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
  1963		return rc;
  1964	}
  1965	EXPORT_SYMBOL(set_mce_nospec);
  1966	
  1967	/* Restore full speculative operation to the pfn. */
> 1968	int clear_mce_nospec(unsigned long pfn)
  1969	{
  1970		return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
  1971	}
  1972	EXPORT_SYMBOL(clear_mce_nospec);
  1973	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-19  8:24     ` kernel test robot
  0 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:24 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: llvm, kbuild-all

Hi Jane,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nvdimm/libnvdimm-for-next]
[also build test WARNING on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: x86_64-randconfig-a012 (https://download.01.org/0day-ci/archive/20220319/202203191603.mQUQZZV5-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6e70e4056dff962ec634c5bd4f2f4105a0bef71)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/71b9b09529b207ce15667c1f5fba4b727b6754e6
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 71b9b09529b207ce15667c1f5fba4b727b6754e6
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash arch/x86/mm/pat/ fs/fuse/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/mm/pat/set_memory.c:1935:5: warning: no previous prototype for function 'set_mce_nospec' [-Wmissing-prototypes]
   int set_mce_nospec(unsigned long pfn, bool unmap)
       ^
   arch/x86/mm/pat/set_memory.c:1935:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int set_mce_nospec(unsigned long pfn, bool unmap)
   ^
   static 
>> arch/x86/mm/pat/set_memory.c:1968:5: warning: no previous prototype for function 'clear_mce_nospec' [-Wmissing-prototypes]
   int clear_mce_nospec(unsigned long pfn)
       ^
   arch/x86/mm/pat/set_memory.c:1968:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int clear_mce_nospec(unsigned long pfn)
   ^
   static 
   2 warnings generated.


vim +/set_mce_nospec +1935 arch/x86/mm/pat/set_memory.c

  1927	
  1928	#ifdef CONFIG_X86_64
  1929	/*
  1930	 * Prevent speculative access to the page by either unmapping
  1931	 * it (if we do not require access to any part of the page) or
  1932	 * marking it uncacheable (if we want to try to retrieve data
  1933	 * from non-poisoned lines in the page).
  1934	 */
> 1935	int set_mce_nospec(unsigned long pfn, bool unmap)
  1936	{
  1937		unsigned long decoy_addr;
  1938		int rc;
  1939	
  1940		/* SGX pages are not in the 1:1 map */
  1941		if (arch_is_platform_page(pfn << PAGE_SHIFT))
  1942			return 0;
  1943		/*
  1944		 * We would like to just call:
  1945		 *      set_memory_XX((unsigned long)pfn_to_kaddr(pfn), 1);
  1946		 * but doing that would radically increase the odds of a
  1947		 * speculative access to the poison page because we'd have
  1948		 * the virtual address of the kernel 1:1 mapping sitting
  1949		 * around in registers.
  1950		 * Instead we get tricky.  We create a non-canonical address
  1951		 * that looks just like the one we want, but has bit 63 flipped.
  1952		 * This relies on set_memory_XX() properly sanitizing any __pa()
  1953		 * results with __PHYSICAL_MASK or PTE_PFN_MASK.
  1954		 */
  1955		decoy_addr = (pfn << PAGE_SHIFT) + (PAGE_OFFSET ^ BIT(63));
  1956	
  1957		if (unmap)
  1958			rc = set_memory_np(decoy_addr, 1);
  1959		else
  1960			rc = set_memory_uc(decoy_addr, 1);
  1961		if (rc)
  1962			pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
  1963		return rc;
  1964	}
  1965	EXPORT_SYMBOL(set_mce_nospec);
  1966	
  1967	/* Restore full speculative operation to the pfn. */
> 1968	int clear_mce_nospec(unsigned long pfn)
  1969	{
  1970		return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
  1971	}
  1972	EXPORT_SYMBOL(clear_mce_nospec);
  1973	

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-19  8:44     ` kernel test robot
  -1 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:44 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: llvm, kbuild-all

Hi Jane,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on nvdimm/libnvdimm-for-next]
[also build test ERROR on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: s390-randconfig-r044-20220318 (https://download.01.org/0day-ci/archive/20220319/202203191635.vyoiL17f-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6e70e4056dff962ec634c5bd4f2f4105a0bef71)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/203570f765a6ad07eb5809850478a25a5257f7e2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 203570f765a6ad07eb5809850478a25a5257f7e2
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash drivers/s390/block/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> drivers/s390/block/dcssblk.c:53:63: error: too few arguments to function call, expected 6, have 5
           rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
                ~~~~~~~~~~~~~~~~~                                       ^
   include/linux/dax.h:186:6: note: 'dax_direct_access' declared here
   long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
        ^
   12 warnings and 1 error generated.


vim +53 drivers/s390/block/dcssblk.c

7a2765f6e82063f Dan Williams 2017-01-26  46  
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  47  static int dcssblk_dax_zero_page_range(struct dax_device *dax_dev,
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  48  				       pgoff_t pgoff, size_t nr_pages)
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  49  {
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  50  	long rc;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  51  	void *kaddr;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  52  
79fa974ff6bc3f7 Vivek Goyal  2020-02-28 @53  	rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  54  	if (rc < 0)
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  55  		return rc;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  56  	memset(kaddr, 0, nr_pages << PAGE_SHIFT);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  57  	dax_flush(dax_dev, kaddr, nr_pages << PAGE_SHIFT);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  58  	return 0;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  59  }
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  60  

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-19  8:44     ` kernel test robot
  0 siblings, 0 replies; 52+ messages in thread
From: kernel test robot @ 2022-03-19  8:44 UTC (permalink / raw)
  To: Jane Chu, david, djwong, dan.j.williams, hch, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs
  Cc: llvm, kbuild-all

Hi Jane,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on nvdimm/libnvdimm-for-next]
[also build test ERROR on device-mapper-dm/for-next linus/master v5.17-rc8 next-20220318]
[cannot apply to tip/x86/mm]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Jane-Chu/DAX-poison-recovery/20220319-143144
base:   https://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm.git libnvdimm-for-next
config: s390-randconfig-r044-20220318 (https://download.01.org/0day-ci/archive/20220319/202203191635.vyoiL17f-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project a6e70e4056dff962ec634c5bd4f2f4105a0bef71)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install s390 cross compiling tool for clang build
        # apt-get install binutils-s390x-linux-gnu
        # https://github.com/0day-ci/linux/commit/203570f765a6ad07eb5809850478a25a5257f7e2
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Jane-Chu/DAX-poison-recovery/20220319-143144
        git checkout 203570f765a6ad07eb5809850478a25a5257f7e2
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash drivers/s390/block/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:464:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:477:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
                                                             ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
   #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
                                                        ^
   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:490:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
                                                             ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
   #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
                                                        ^
   In file included from drivers/s390/block/dcssblk.c:24:
   In file included from arch/s390/include/asm/io.h:75:
   include/asm-generic/io.h:501:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:511:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:521:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:609:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsb(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:617:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsw(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:625:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           readsl(PCI_IOBASE + addr, buffer, count);
                  ~~~~~~~~~~ ^
   include/asm-generic/io.h:634:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesb(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:643:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesw(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
   include/asm-generic/io.h:652:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           writesl(PCI_IOBASE + addr, buffer, count);
                   ~~~~~~~~~~ ^
>> drivers/s390/block/dcssblk.c:53:63: error: too few arguments to function call, expected 6, have 5
           rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
                ~~~~~~~~~~~~~~~~~                                       ^
   include/linux/dax.h:186:6: note: 'dax_direct_access' declared here
   long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
        ^
   12 warnings and 1 error generated.


vim +53 drivers/s390/block/dcssblk.c

7a2765f6e82063f Dan Williams 2017-01-26  46  
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  47  static int dcssblk_dax_zero_page_range(struct dax_device *dax_dev,
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  48  				       pgoff_t pgoff, size_t nr_pages)
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  49  {
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  50  	long rc;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  51  	void *kaddr;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  52  
79fa974ff6bc3f7 Vivek Goyal  2020-02-28 @53  	rc = dax_direct_access(dax_dev, pgoff, nr_pages, &kaddr, NULL);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  54  	if (rc < 0)
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  55  		return rc;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  56  	memset(kaddr, 0, nr_pages << PAGE_SHIFT);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  57  	dax_flush(dax_dev, kaddr, nr_pages << PAGE_SHIFT);
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  58  	return 0;
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  59  }
79fa974ff6bc3f7 Vivek Goyal  2020-02-28  60  

---
0-DAY CI Kernel Test Service
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 1/6] x86/mm: fix comment
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22  8:40     ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:40 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On Sat, Mar 19, 2022 at 12:28:28AM -0600, Jane Chu wrote:
> There is no _set_memory_prot internal helper, while coming across
> the code, might as well fix the comment.
> 
> Signed-off-by: Jane Chu <jane.chu@oracle.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 1/6] x86/mm: fix comment
@ 2022-03-22  8:40     ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:40 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

On Sat, Mar 19, 2022 at 12:28:28AM -0600, Jane Chu wrote:
> There is no _set_memory_prot internal helper, while coming across
> the code, might as well fix the comment.
> 
> Signed-off-by: Jane Chu <jane.chu@oracle.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22  8:42     ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:42 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

> +EXPORT_SYMBOL(set_mce_nospec);

No need for this export at all.

> +
> +/* Restore full speculative operation to the pfn. */
> +int clear_mce_nospec(unsigned long pfn)
> +{
> +	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
> +}
> +EXPORT_SYMBOL(clear_mce_nospec);

And this should be EXPORT_SYMBOL_GPL.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-22  8:42     ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:42 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

> +EXPORT_SYMBOL(set_mce_nospec);

No need for this export at all.

> +
> +/* Restore full speculative operation to the pfn. */
> +int clear_mce_nospec(unsigned long pfn)
> +{
> +	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
> +}
> +EXPORT_SYMBOL(clear_mce_nospec);

And this should be EXPORT_SYMBOL_GPL.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22  8:44     ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:44 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On Sat, Mar 19, 2022 at 12:28:30AM -0600, Jane Chu wrote:
> Mark poisoned page as not present, and to reverse the 'np' effect,
> restate the _PAGE_PRESENT bit. Please refer to discussions here for
> reason behind the decision.
> https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

I think it would be good to summarize the conclusion here instead of
just linking to it.

> +static int _set_memory_present(unsigned long addr, int numpages)
> +{
> +	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
> +}

What is the point of this trivial helper with a single caller?


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
@ 2022-03-22  8:44     ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:44 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

On Sat, Mar 19, 2022 at 12:28:30AM -0600, Jane Chu wrote:
> Mark poisoned page as not present, and to reverse the 'np' effect,
> restate the _PAGE_PRESENT bit. Please refer to discussions here for
> reason behind the decision.
> https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/

I think it would be good to summarize the conclusion here instead of
just linking to it.

> +static int _set_memory_present(unsigned long addr, int numpages)
> +{
> +	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
> +}

What is the point of this trivial helper with a single caller?

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22  8:53     ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:53 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

> -static void hwpoison_clear(struct pmem_device *pmem,
> -		phys_addr_t phys, unsigned int len)
> +static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
>  {
> +	return pmem->phys_addr + offset;
> +}
> +
> +static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
> +{
> +	return (offset - pmem->data_offset) / 512;
> +}
> +
> +static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
> +{
> +	return sector * 512 + pmem->data_offset;
> +}

The multiplicate / divison using 512 could use shifts using
SECTOR_SHIFT.

> +
> +static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
> +		unsigned int len)

> +static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)

All these functions lack a pmem_ prefix.

> +static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
> +		phys_addr_t offset, unsigned int len,
> +		unsigned int *blks)
> +{
> +	phys_addr_t phys = to_phys(pmem, offset);
>  	long cleared;
> +	blk_status_t rc;
>  
> +	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
> +	*blks = cleared / 512;
> +	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
> +	if (cleared <= 0 || *blks == 0)
> +		return rc;

This look odd.  I think just returing the cleared byte value would
make much more sense:

static long __pmem_clear_poison(struct pmem_device *pmem,
		phys_addr_t offset, unsigned int len)
{
	long cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);

	if (cleared > 0) {
		clear_hwpoison(pmem, offset, cleared);
		arch_invalidate_pmem(pmem->virt_addr + offset, len);
	}

	return cleared;
}

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
@ 2022-03-22  8:53     ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  8:53 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

> -static void hwpoison_clear(struct pmem_device *pmem,
> -		phys_addr_t phys, unsigned int len)
> +static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
>  {
> +	return pmem->phys_addr + offset;
> +}
> +
> +static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
> +{
> +	return (offset - pmem->data_offset) / 512;
> +}
> +
> +static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
> +{
> +	return sector * 512 + pmem->data_offset;
> +}

The multiplicate / divison using 512 could use shifts using
SECTOR_SHIFT.

> +
> +static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
> +		unsigned int len)

> +static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)

All these functions lack a pmem_ prefix.

> +static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
> +		phys_addr_t offset, unsigned int len,
> +		unsigned int *blks)
> +{
> +	phys_addr_t phys = to_phys(pmem, offset);
>  	long cleared;
> +	blk_status_t rc;
>  
> +	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
> +	*blks = cleared / 512;
> +	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
> +	if (cleared <= 0 || *blks == 0)
> +		return rc;

This look odd.  I think just returing the cleared byte value would
make much more sense:

static long __pmem_clear_poison(struct pmem_device *pmem,
		phys_addr_t offset, unsigned int len)
{
	long cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);

	if (cleared > 0) {
		clear_hwpoison(pmem, offset, cleared);
		arch_invalidate_pmem(pmem->virt_addr + offset, len);
	}

	return cleared;
}

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22  9:01     ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  9:01 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On Sat, Mar 19, 2022 at 12:28:31AM -0600, Jane Chu wrote:
> Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
> not set by default in dax_direct_access() such that the helper
> does not translate a pmem range to kernel virtual address if the
> range contains uncorrectable errors.  When the flag is set,
> the helper ignores the UEs and return kernel virtual adderss so
> that the caller may get on with data recovery via write.

This DAX_RECOVERY doesn't actually seem to be used anywhere here or
in the subsequent patches.  Did I miss something?

> Also introduce a new dev_pagemap_ops .recovery_write function.
> The function is applicable to FSDAX device only. The device
> page backend driver provides .recovery_write function if the
> device has underlying mechanism to clear the uncorrectable
> errors on the fly.

Why is this not in struct dax_operations?

>  
> +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
> +		void *addr, size_t bytes, struct iov_iter *iter)
> +{
> +	struct dev_pagemap *pgmap = dax_dev->pgmap;
> +
> +	if (!pgmap || !pgmap->ops->recovery_write)
> +		return -EIO;
> +	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
> +				(void *)iter);

No need to cast a type pointer to a void pointer.  But more importantly
losing the type information here and passing it as void seems very
wrong.

> +static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
> +		void *addr, size_t bytes, void *iter)
> +{
> +	struct pmem_device *pmem = pgmap->owner;
> +
> +	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
> +
> +	/* XXX more later */
> +	return 0;
> +}

This shuld not be added here - the core code can cope with a NULL
method just fine.

> +		recov = 0;
> +		flags = 0;
> +		nrpg = PHYS_PFN(size);

Please spell out the words.  The recovery flag can also be
a bool to make the code more readable.

> +		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
> +					&kaddr, NULL);
> +		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {

No need for the inner braces.

> +			flags |= DAX_RECOVERY;
> +			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
> +						flags, &kaddr, NULL);

And noneed for the flags variable at all really.

>  			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
>  					map_len, iter);
>  		else
> @@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
>  		length -= xfer;
>  		done += xfer;
>  
> +		if (recov && (xfer == (ssize_t) -EIO)) {
> +			pr_warn("dax_recovery_write failed\n");
> +			ret = -EIO;
> +			break;

And no, we can't just use an unsigned variable to communicate a
negative error code.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-22  9:01     ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-22  9:01 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

On Sat, Mar 19, 2022 at 12:28:31AM -0600, Jane Chu wrote:
> Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
> not set by default in dax_direct_access() such that the helper
> does not translate a pmem range to kernel virtual address if the
> range contains uncorrectable errors.  When the flag is set,
> the helper ignores the UEs and return kernel virtual adderss so
> that the caller may get on with data recovery via write.

This DAX_RECOVERY doesn't actually seem to be used anywhere here or
in the subsequent patches.  Did I miss something?

> Also introduce a new dev_pagemap_ops .recovery_write function.
> The function is applicable to FSDAX device only. The device
> page backend driver provides .recovery_write function if the
> device has underlying mechanism to clear the uncorrectable
> errors on the fly.

Why is this not in struct dax_operations?

>  
> +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
> +		void *addr, size_t bytes, struct iov_iter *iter)
> +{
> +	struct dev_pagemap *pgmap = dax_dev->pgmap;
> +
> +	if (!pgmap || !pgmap->ops->recovery_write)
> +		return -EIO;
> +	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
> +				(void *)iter);

No need to cast a type pointer to a void pointer.  But more importantly
losing the type information here and passing it as void seems very
wrong.

> +static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
> +		void *addr, size_t bytes, void *iter)
> +{
> +	struct pmem_device *pmem = pgmap->owner;
> +
> +	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
> +
> +	/* XXX more later */
> +	return 0;
> +}

This shuld not be added here - the core code can cope with a NULL
method just fine.

> +		recov = 0;
> +		flags = 0;
> +		nrpg = PHYS_PFN(size);

Please spell out the words.  The recovery flag can also be
a bool to make the code more readable.

> +		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
> +					&kaddr, NULL);
> +		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {

No need for the inner braces.

> +			flags |= DAX_RECOVERY;
> +			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
> +						flags, &kaddr, NULL);

And noneed for the flags variable at all really.

>  			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
>  					map_len, iter);
>  		else
> @@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
>  		length -= xfer;
>  		done += xfer;
>  
> +		if (recov && (xfer == (ssize_t) -EIO)) {
> +			pr_warn("dax_recovery_write failed\n");
> +			ret = -EIO;
> +			break;

And no, we can't just use an unsigned variable to communicate a
negative error code.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-22  8:42     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 22:19       ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 22:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 1:42 AM, Christoph Hellwig wrote:
>> +EXPORT_SYMBOL(set_mce_nospec);
> 
> No need for this export at all.

Indeed, my bad, will remove it.

> 
>> +
>> +/* Restore full speculative operation to the pfn. */
>> +int clear_mce_nospec(unsigned long pfn)
>> +{
>> +	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
>> +}
>> +EXPORT_SYMBOL(clear_mce_nospec);
> 
> And this should be EXPORT_SYMBOL_GPL.

Yes.

Thanks!
-jane


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-22 22:19       ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 22:19 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 1:42 AM, Christoph Hellwig wrote:
>> +EXPORT_SYMBOL(set_mce_nospec);
> 
> No need for this export at all.

Indeed, my bad, will remove it.

> 
>> +
>> +/* Restore full speculative operation to the pfn. */
>> +int clear_mce_nospec(unsigned long pfn)
>> +{
>> +	return set_memory_wb((unsigned long) pfn_to_kaddr(pfn), 1);
>> +}
>> +EXPORT_SYMBOL(clear_mce_nospec);
> 
> And this should be EXPORT_SYMBOL_GPL.

Yes.

Thanks!
-jane

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-19  6:28   ` [dm-devel] " Jane Chu
@ 2022-03-22 22:41     ` Borislav Petkov
  -1 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2022-03-22 22:41 UTC (permalink / raw)
  To: Jane Chu
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On Sat, Mar 19, 2022 at 12:28:29AM -0600, Jane Chu wrote:
> Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
> file where they belong.
> 
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  arch/x86/include/asm/set_memory.h | 52 -------------------------------
>  arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
>  include/linux/set_memory.h        |  9 +++---
>  3 files changed, 53 insertions(+), 56 deletions(-)

For the future, please use get_maintainers.pl so that you know who to Cc
on patches. In this particular case, patches touching arch/x86/ should
Cc x86@kernel.org

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-22 22:41     ` Borislav Petkov
  0 siblings, 0 replies; 52+ messages in thread
From: Borislav Petkov @ 2022-03-22 22:41 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

On Sat, Mar 19, 2022 at 12:28:29AM -0600, Jane Chu wrote:
> Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
> file where they belong.
> 
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  arch/x86/include/asm/set_memory.h | 52 -------------------------------
>  arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
>  include/linux/set_memory.h        |  9 +++---
>  3 files changed, 53 insertions(+), 56 deletions(-)

For the future, please use get_maintainers.pl so that you know who to Cc
on patches. In this particular case, patches touching arch/x86/ should
Cc x86@kernel.org

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
  2022-03-22  8:44     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 22:45       ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 22:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 1:44 AM, Christoph Hellwig wrote:
> On Sat, Mar 19, 2022 at 12:28:30AM -0600, Jane Chu wrote:
>> Mark poisoned page as not present, and to reverse the 'np' effect,
>> restate the _PAGE_PRESENT bit. Please refer to discussions here for
>> reason behind the decision.
>> https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/
> 
> I think it would be good to summarize the conclusion here instead of
> just linking to it.

Sure, will do.

> 
>> +static int _set_memory_present(unsigned long addr, int numpages)
>> +{
>> +	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
>> +}
> 
> What is the point of this trivial helper with a single caller?
> 

Okay, will remove the helper.

thanks!
-jane



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page
@ 2022-03-22 22:45       ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 22:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 1:44 AM, Christoph Hellwig wrote:
> On Sat, Mar 19, 2022 at 12:28:30AM -0600, Jane Chu wrote:
>> Mark poisoned page as not present, and to reverse the 'np' effect,
>> restate the _PAGE_PRESENT bit. Please refer to discussions here for
>> reason behind the decision.
>> https://lore.kernel.org/all/CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com/
> 
> I think it would be good to summarize the conclusion here instead of
> just linking to it.

Sure, will do.

> 
>> +static int _set_memory_present(unsigned long addr, int numpages)
>> +{
>> +	return change_page_attr_set(&addr, numpages, __pgprot(_PAGE_PRESENT), 0);
>> +}
> 
> What is the point of this trivial helper with a single caller?
> 

Okay, will remove the helper.

thanks!
-jane


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-22  9:01     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 23:05       ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 2:01 AM, Christoph Hellwig wrote:
> On Sat, Mar 19, 2022 at 12:28:31AM -0600, Jane Chu wrote:
>> Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
>> not set by default in dax_direct_access() such that the helper
>> does not translate a pmem range to kernel virtual address if the
>> range contains uncorrectable errors.  When the flag is set,
>> the helper ignores the UEs and return kernel virtual adderss so
>> that the caller may get on with data recovery via write.
> 
> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
> in the subsequent patches.  Did I miss something?

dax_iomap_iter() uses the flag in the same patch,
+               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
+                       flags |= DAX_RECOVERY;
+                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
+                                               flags, &kaddr, NULL);


> 
>> Also introduce a new dev_pagemap_ops .recovery_write function.
>> The function is applicable to FSDAX device only. The device
>> page backend driver provides .recovery_write function if the
>> device has underlying mechanism to clear the uncorrectable
>> errors on the fly.
> 
> Why is this not in struct dax_operations?

Per Dan's comments to the v5 series, adding .recovery_write to
dax_operations causes a number of trivial dm targets changes.
Dan suggested that adding .recovery_write to pagemap_ops could
cut short the logistics of figuring out whether the driver backing
up a page is indeed capable of clearing poison. Please see
https://lkml.org/lkml/2022/2/4/31

> 
>>   
>> +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
>> +		void *addr, size_t bytes, struct iov_iter *iter)
>> +{
>> +	struct dev_pagemap *pgmap = dax_dev->pgmap;
>> +
>> +	if (!pgmap || !pgmap->ops->recovery_write)
>> +		return -EIO;
>> +	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
>> +				(void *)iter);
> 
> No need to cast a type pointer to a void pointer.  But more importantly
> losing the type information here and passing it as void seems very
> wrong.

include/linux/memremap.h doesn't know struct iov_iter which is defined 
in include/linux/uio.h,  would you prefer to adding include/linux/uio.h 
to include/linux/memremap.h ?

> 
>> +static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
>> +		void *addr, size_t bytes, void *iter)
>> +{
>> +	struct pmem_device *pmem = pgmap->owner;
>> +
>> +	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
>> +
>> +	/* XXX more later */
>> +	return 0;
>> +}
> 
> This shuld not be added here - the core code can cope with a NULL
> method just fine.

Okay, will remove the XXX line.

> 
>> +		recov = 0;
>> +		flags = 0;
>> +		nrpg = PHYS_PFN(size);
> 
> Please spell out the words.  The recovery flag can also be
> a bool to make the code more readable.

Sure.

> 
>> +		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
>> +					&kaddr, NULL);
>> +		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
> 
> No need for the inner braces.

Okay.

> 
>> +			flags |= DAX_RECOVERY;
>> +			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +						flags, &kaddr, NULL);
> 
> And noneed for the flags variable at all really.

Okay.
> 
>>   			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
>>   					map_len, iter);
>>   		else
>> @@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
>>   		length -= xfer;
>>   		done += xfer;
>>   
>> +		if (recov && (xfer == (ssize_t) -EIO)) {
>> +			pr_warn("dax_recovery_write failed\n");
>> +			ret = -EIO;
>> +			break;
> 
> And no, we can't just use an unsigned variable to communicate a
> negative error code.

Okay, will have dax_recovery_write return 0 in all error cases.

thanks!
-jane


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-22 23:05       ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 2:01 AM, Christoph Hellwig wrote:
> On Sat, Mar 19, 2022 at 12:28:31AM -0600, Jane Chu wrote:
>> Introduce DAX_RECOVERY flag to dax_direct_access(). The flag is
>> not set by default in dax_direct_access() such that the helper
>> does not translate a pmem range to kernel virtual address if the
>> range contains uncorrectable errors.  When the flag is set,
>> the helper ignores the UEs and return kernel virtual adderss so
>> that the caller may get on with data recovery via write.
> 
> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
> in the subsequent patches.  Did I miss something?

dax_iomap_iter() uses the flag in the same patch,
+               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
+                       flags |= DAX_RECOVERY;
+                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
+                                               flags, &kaddr, NULL);


> 
>> Also introduce a new dev_pagemap_ops .recovery_write function.
>> The function is applicable to FSDAX device only. The device
>> page backend driver provides .recovery_write function if the
>> device has underlying mechanism to clear the uncorrectable
>> errors on the fly.
> 
> Why is this not in struct dax_operations?

Per Dan's comments to the v5 series, adding .recovery_write to
dax_operations causes a number of trivial dm targets changes.
Dan suggested that adding .recovery_write to pagemap_ops could
cut short the logistics of figuring out whether the driver backing
up a page is indeed capable of clearing poison. Please see
https://lkml.org/lkml/2022/2/4/31

> 
>>   
>> +size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
>> +		void *addr, size_t bytes, struct iov_iter *iter)
>> +{
>> +	struct dev_pagemap *pgmap = dax_dev->pgmap;
>> +
>> +	if (!pgmap || !pgmap->ops->recovery_write)
>> +		return -EIO;
>> +	return pgmap->ops->recovery_write(pgmap, pgoff, addr, bytes,
>> +				(void *)iter);
> 
> No need to cast a type pointer to a void pointer.  But more importantly
> losing the type information here and passing it as void seems very
> wrong.

include/linux/memremap.h doesn't know struct iov_iter which is defined 
in include/linux/uio.h,  would you prefer to adding include/linux/uio.h 
to include/linux/memremap.h ?

> 
>> +static size_t pmem_recovery_write(struct dev_pagemap *pgmap, pgoff_t pgoff,
>> +		void *addr, size_t bytes, void *iter)
>> +{
>> +	struct pmem_device *pmem = pgmap->owner;
>> +
>> +	dev_warn(pmem->bb.dev, "%s: not yet implemented\n", __func__);
>> +
>> +	/* XXX more later */
>> +	return 0;
>> +}
> 
> This shuld not be added here - the core code can cope with a NULL
> method just fine.

Okay, will remove the XXX line.

> 
>> +		recov = 0;
>> +		flags = 0;
>> +		nrpg = PHYS_PFN(size);
> 
> Please spell out the words.  The recovery flag can also be
> a bool to make the code more readable.

Sure.

> 
>> +		map_len = dax_direct_access(dax_dev, pgoff, nrpg, flags,
>> +					&kaddr, NULL);
>> +		if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
> 
> No need for the inner braces.

Okay.

> 
>> +			flags |= DAX_RECOVERY;
>> +			map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +						flags, &kaddr, NULL);
> 
> And noneed for the flags variable at all really.

Okay.
> 
>>   			xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,
>>   					map_len, iter);
>>   		else
>> @@ -1271,6 +1286,11 @@ static loff_t dax_iomap_iter(const struct iomap_iter *iomi,
>>   		length -= xfer;
>>   		done += xfer;
>>   
>> +		if (recov && (xfer == (ssize_t) -EIO)) {
>> +			pr_warn("dax_recovery_write failed\n");
>> +			ret = -EIO;
>> +			break;
> 
> And no, we can't just use an unsigned variable to communicate a
> negative error code.

Okay, will have dax_recovery_write return 0 in all error cases.

thanks!
-jane

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
  2022-03-22  8:53     ` [dm-devel] " Christoph Hellwig
@ 2022-03-22 23:45       ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 1:53 AM, Christoph Hellwig wrote:
>> -static void hwpoison_clear(struct pmem_device *pmem,
>> -		phys_addr_t phys, unsigned int len)
>> +static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
>>   {
>> +	return pmem->phys_addr + offset;
>> +}
>> +
>> +static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
>> +{
>> +	return (offset - pmem->data_offset) / 512;
>> +}
>> +
>> +static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
>> +{
>> +	return sector * 512 + pmem->data_offset;
>> +}
> 
> The multiplicate / divison using 512 could use shifts using
> SECTOR_SHIFT.

Nice, will do.

> 
>> +
>> +static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
>> +		unsigned int len)
> 
>> +static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)
> 
> All these functions lack a pmem_ prefix.

Did you mean all of the helpers or just "clear_hwpoison" and "clear_bb"? 
   The reason I ask is that there are existing static helpers without 
pmem_ prefix, just short function names.

> 
>> +static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
>> +		phys_addr_t offset, unsigned int len,
>> +		unsigned int *blks)
>> +{
>> +	phys_addr_t phys = to_phys(pmem, offset);
>>   	long cleared;
>> +	blk_status_t rc;
>>   
>> +	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
>> +	*blks = cleared / 512;
>> +	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
>> +	if (cleared <= 0 || *blks == 0)
>> +		return rc;
> 
> This look odd.  I think just returing the cleared byte value would
> make much more sense:
> 
> static long __pmem_clear_poison(struct pmem_device *pmem,
> 		phys_addr_t offset, unsigned int len)
> {
> 	long cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
> 
> 	if (cleared > 0) {
> 		clear_hwpoison(pmem, offset, cleared);
> 		arch_invalidate_pmem(pmem->virt_addr + offset, len);
> 	}
> 
> 	return cleared;
> }

Yes, this is cleaner, will do.

Thanks!
-jane



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 5/6] pmem: refactor pmem_clear_poison()
@ 2022-03-22 23:45       ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 1:53 AM, Christoph Hellwig wrote:
>> -static void hwpoison_clear(struct pmem_device *pmem,
>> -		phys_addr_t phys, unsigned int len)
>> +static phys_addr_t to_phys(struct pmem_device *pmem, phys_addr_t offset)
>>   {
>> +	return pmem->phys_addr + offset;
>> +}
>> +
>> +static sector_t to_sect(struct pmem_device *pmem, phys_addr_t offset)
>> +{
>> +	return (offset - pmem->data_offset) / 512;
>> +}
>> +
>> +static phys_addr_t to_offset(struct pmem_device *pmem, sector_t sector)
>> +{
>> +	return sector * 512 + pmem->data_offset;
>> +}
> 
> The multiplicate / divison using 512 could use shifts using
> SECTOR_SHIFT.

Nice, will do.

> 
>> +
>> +static void clear_hwpoison(struct pmem_device *pmem, phys_addr_t offset,
>> +		unsigned int len)
> 
>> +static void clear_bb(struct pmem_device *pmem, sector_t sector, long blks)
> 
> All these functions lack a pmem_ prefix.

Did you mean all of the helpers or just "clear_hwpoison" and "clear_bb"? 
   The reason I ask is that there are existing static helpers without 
pmem_ prefix, just short function names.

> 
>> +static blk_status_t __pmem_clear_poison(struct pmem_device *pmem,
>> +		phys_addr_t offset, unsigned int len,
>> +		unsigned int *blks)
>> +{
>> +	phys_addr_t phys = to_phys(pmem, offset);
>>   	long cleared;
>> +	blk_status_t rc;
>>   
>> +	cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
>> +	*blks = cleared / 512;
>> +	rc = (cleared < len) ? BLK_STS_IOERR : BLK_STS_OK;
>> +	if (cleared <= 0 || *blks == 0)
>> +		return rc;
> 
> This look odd.  I think just returing the cleared byte value would
> make much more sense:
> 
> static long __pmem_clear_poison(struct pmem_device *pmem,
> 		phys_addr_t offset, unsigned int len)
> {
> 	long cleared = nvdimm_clear_poison(to_dev(pmem), phys, len);
> 
> 	if (cleared > 0) {
> 		clear_hwpoison(pmem, offset, cleared);
> 		arch_invalidate_pmem(pmem->virt_addr + offset, len);
> 	}
> 
> 	return cleared;
> }

Yes, this is cleaner, will do.

Thanks!
-jane


--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
  2022-03-22 22:41     ` [dm-devel] " Borislav Petkov
@ 2022-03-22 23:48       ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: david, djwong, dan.j.williams, hch, vishal.l.verma, dave.jiang,
	agk, snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 3:41 PM, Borislav Petkov wrote:
> On Sat, Mar 19, 2022 at 12:28:29AM -0600, Jane Chu wrote:
>> Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
>> file where they belong.
>>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>>   arch/x86/include/asm/set_memory.h | 52 -------------------------------
>>   arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
>>   include/linux/set_memory.h        |  9 +++---
>>   3 files changed, 53 insertions(+), 56 deletions(-)
> 
> For the future, please use get_maintainers.pl so that you know who to Cc
> on patches. In this particular case, patches touching arch/x86/ should
> Cc x86@kernel.org

Sure, thank you!

-jane

> 
> Thx.
> 


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions
@ 2022-03-22 23:48       ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-22 23:48 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	hch, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, linux-xfs, agk

On 3/22/2022 3:41 PM, Borislav Petkov wrote:
> On Sat, Mar 19, 2022 at 12:28:29AM -0600, Jane Chu wrote:
>> Relocate the twin mce functions to arch/x86/mm/pat/set_memory.c
>> file where they belong.
>>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>>   arch/x86/include/asm/set_memory.h | 52 -------------------------------
>>   arch/x86/mm/pat/set_memory.c      | 48 ++++++++++++++++++++++++++++
>>   include/linux/set_memory.h        |  9 +++---
>>   3 files changed, 53 insertions(+), 56 deletions(-)
> 
> For the future, please use get_maintainers.pl so that you know who to Cc
> on patches. In this particular case, patches touching arch/x86/ should
> Cc x86@kernel.org

Sure, thank you!

-jane

> 
> Thx.
> 

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-22 23:05       ` [dm-devel] " Jane Chu
@ 2022-03-23  5:45         ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-23  5:45 UTC (permalink / raw)
  To: Jane Chu
  Cc: Christoph Hellwig, david, djwong, dan.j.williams, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs

On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
> > This DAX_RECOVERY doesn't actually seem to be used anywhere here or
> > in the subsequent patches.  Did I miss something?
> 
> dax_iomap_iter() uses the flag in the same patch,
> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
> +                       flags |= DAX_RECOVERY;
> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
> +                                               flags, &kaddr, NULL);

Yes, it passes it on to dax_direct_access, and dax_direct_access passes
it onto ->direct_access.  But nothing in this series actually checks
for it as far as I can tell.

> >> Also introduce a new dev_pagemap_ops .recovery_write function.
> >> The function is applicable to FSDAX device only. The device
> >> page backend driver provides .recovery_write function if the
> >> device has underlying mechanism to clear the uncorrectable
> >> errors on the fly.
> > 
> > Why is this not in struct dax_operations?
> 
> Per Dan's comments to the v5 series, adding .recovery_write to
> dax_operations causes a number of trivial dm targets changes.
> Dan suggested that adding .recovery_write to pagemap_ops could
> cut short the logistics of figuring out whether the driver backing
> up a page is indeed capable of clearing poison. Please see
> https://lkml.org/lkml/2022/2/4/31

But at least in this series there is  1:1 association between the
pgmap and the dax_device so that scheme won't work.   It would
have to lookup the pgmap based on the return physical address from
dax_direct_access.  Which sounds more complicated than just adding
the (annoying) boilerplate code to DM.

> include/linux/memremap.h doesn't know struct iov_iter which is defined 
> in include/linux/uio.h,  would you prefer to adding include/linux/uio.h 
> to include/linux/memremap.h ?

As it is not derefences just adding a

struct iov_iter;

line to memremap.h below the includes should be all that is needed.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-23  5:45         ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-23  5:45 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	Christoph Hellwig, dm-devel, vgoyal, vishal.l.verma,
	linux-fsdevel, dan.j.williams, ira.weiny, linux-xfs, agk

On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
> > This DAX_RECOVERY doesn't actually seem to be used anywhere here or
> > in the subsequent patches.  Did I miss something?
> 
> dax_iomap_iter() uses the flag in the same patch,
> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
> +                       flags |= DAX_RECOVERY;
> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
> +                                               flags, &kaddr, NULL);

Yes, it passes it on to dax_direct_access, and dax_direct_access passes
it onto ->direct_access.  But nothing in this series actually checks
for it as far as I can tell.

> >> Also introduce a new dev_pagemap_ops .recovery_write function.
> >> The function is applicable to FSDAX device only. The device
> >> page backend driver provides .recovery_write function if the
> >> device has underlying mechanism to clear the uncorrectable
> >> errors on the fly.
> > 
> > Why is this not in struct dax_operations?
> 
> Per Dan's comments to the v5 series, adding .recovery_write to
> dax_operations causes a number of trivial dm targets changes.
> Dan suggested that adding .recovery_write to pagemap_ops could
> cut short the logistics of figuring out whether the driver backing
> up a page is indeed capable of clearing poison. Please see
> https://lkml.org/lkml/2022/2/4/31

But at least in this series there is  1:1 association between the
pgmap and the dax_device so that scheme won't work.   It would
have to lookup the pgmap based on the return physical address from
dax_direct_access.  Which sounds more complicated than just adding
the (annoying) boilerplate code to DM.

> include/linux/memremap.h doesn't know struct iov_iter which is defined 
> in include/linux/uio.h,  would you prefer to adding include/linux/uio.h 
> to include/linux/memremap.h ?

As it is not derefences just adding a

struct iov_iter;

line to memremap.h below the includes should be all that is needed.

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-23  5:45         ` [dm-devel] " Christoph Hellwig
@ 2022-03-23 18:43           ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-23 18:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 10:45 PM, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
>>> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
>>> in the subsequent patches.  Did I miss something?
>>
>> dax_iomap_iter() uses the flag in the same patch,
>> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
>> +                       flags |= DAX_RECOVERY;
>> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +                                               flags, &kaddr, NULL);
> 
> Yes, it passes it on to dax_direct_access, and dax_direct_access passes
> it onto ->direct_access.  But nothing in this series actually checks
> for it as far as I can tell.

The flag is checked here, again, I'll spell out the flag rather than 
using it as a boolean.

  __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
  {
  	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;

-	if (unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
+	if (!flags && unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
  					PFN_PHYS(nr_pages))))
  		return -EIO;

> 
>>>> Also introduce a new dev_pagemap_ops .recovery_write function.
>>>> The function is applicable to FSDAX device only. The device
>>>> page backend driver provides .recovery_write function if the
>>>> device has underlying mechanism to clear the uncorrectable
>>>> errors on the fly.
>>>
>>> Why is this not in struct dax_operations?
>>
>> Per Dan's comments to the v5 series, adding .recovery_write to
>> dax_operations causes a number of trivial dm targets changes.
>> Dan suggested that adding .recovery_write to pagemap_ops could
>> cut short the logistics of figuring out whether the driver backing
>> up a page is indeed capable of clearing poison. Please see
>> https://lkml.org/lkml/2022/2/4/31
> 
> But at least in this series there is  1:1 association between the
> pgmap and the dax_device so that scheme won't work.   It would
> have to lookup the pgmap based on the return physical address from
> dax_direct_access.  Which sounds more complicated than just adding
> the (annoying) boilerplate code to DM.
> 

Yes, good point!  Let me look into this.

>> include/linux/memremap.h doesn't know struct iov_iter which is defined
>> in include/linux/uio.h,  would you prefer to adding include/linux/uio.h
>> to include/linux/memremap.h ?
> 
> As it is not derefences just adding a
> 
> struct iov_iter;
> 
> line to memremap.h below the includes should be all that is needed.

Sure, will do.

Thanks!
-jane

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-23 18:43           ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-23 18:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 10:45 PM, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
>>> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
>>> in the subsequent patches.  Did I miss something?
>>
>> dax_iomap_iter() uses the flag in the same patch,
>> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
>> +                       flags |= DAX_RECOVERY;
>> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +                                               flags, &kaddr, NULL);
> 
> Yes, it passes it on to dax_direct_access, and dax_direct_access passes
> it onto ->direct_access.  But nothing in this series actually checks
> for it as far as I can tell.

The flag is checked here, again, I'll spell out the flag rather than 
using it as a boolean.

  __weak long __pmem_direct_access(struct pmem_device *pmem, pgoff_t pgoff,
-		long nr_pages, void **kaddr, pfn_t *pfn)
+		long nr_pages, int flags, void **kaddr, pfn_t *pfn)
  {
  	resource_size_t offset = PFN_PHYS(pgoff) + pmem->data_offset;

-	if (unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
+	if (!flags && unlikely(is_bad_pmem(&pmem->bb, PFN_PHYS(pgoff) / 512,
  					PFN_PHYS(nr_pages))))
  		return -EIO;

> 
>>>> Also introduce a new dev_pagemap_ops .recovery_write function.
>>>> The function is applicable to FSDAX device only. The device
>>>> page backend driver provides .recovery_write function if the
>>>> device has underlying mechanism to clear the uncorrectable
>>>> errors on the fly.
>>>
>>> Why is this not in struct dax_operations?
>>
>> Per Dan's comments to the v5 series, adding .recovery_write to
>> dax_operations causes a number of trivial dm targets changes.
>> Dan suggested that adding .recovery_write to pagemap_ops could
>> cut short the logistics of figuring out whether the driver backing
>> up a page is indeed capable of clearing poison. Please see
>> https://lkml.org/lkml/2022/2/4/31
> 
> But at least in this series there is  1:1 association between the
> pgmap and the dax_device so that scheme won't work.   It would
> have to lookup the pgmap based on the return physical address from
> dax_direct_access.  Which sounds more complicated than just adding
> the (annoying) boilerplate code to DM.
> 

Yes, good point!  Let me look into this.

>> include/linux/memremap.h doesn't know struct iov_iter which is defined
>> in include/linux/uio.h,  would you prefer to adding include/linux/uio.h
>> to include/linux/memremap.h ?
> 
> As it is not derefences just adding a
> 
> struct iov_iter;
> 
> line to memremap.h below the includes should be all that is needed.

Sure, will do.

Thanks!
-jane
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-23 18:43           ` [dm-devel] " Jane Chu
@ 2022-03-24  6:37             ` Christoph Hellwig
  -1 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-24  6:37 UTC (permalink / raw)
  To: Jane Chu
  Cc: Christoph Hellwig, david, djwong, dan.j.williams, vishal.l.verma,
	dave.jiang, agk, snitzer, dm-devel, ira.weiny, willy, vgoyal,
	linux-fsdevel, nvdimm, linux-kernel, linux-xfs

On Wed, Mar 23, 2022 at 06:43:06PM +0000, Jane Chu wrote:
> > it onto ->direct_access.  But nothing in this series actually checks
> > for it as far as I can tell.
> 
> The flag is checked here, again, I'll spell out the flag rather than 
> using it as a boolean.

Yes.  The whole point of a flags value is that it is extensible..

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-24  6:37             ` Christoph Hellwig
  0 siblings, 0 replies; 52+ messages in thread
From: Christoph Hellwig @ 2022-03-24  6:37 UTC (permalink / raw)
  To: Jane Chu
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	Christoph Hellwig, dm-devel, vgoyal, vishal.l.verma,
	linux-fsdevel, dan.j.williams, ira.weiny, linux-xfs, agk

On Wed, Mar 23, 2022 at 06:43:06PM +0000, Jane Chu wrote:
> > it onto ->direct_access.  But nothing in this series actually checks
> > for it as far as I can tell.
> 
> The flag is checked here, again, I'll spell out the flag rather than 
> using it as a boolean.

Yes.  The whole point of a flags value is that it is extensible..

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
  2022-03-23  5:45         ` [dm-devel] " Christoph Hellwig
@ 2022-03-26  6:31           ` Jane Chu
  -1 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-26  6:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: david, djwong, dan.j.williams, vishal.l.verma, dave.jiang, agk,
	snitzer, dm-devel, ira.weiny, willy, vgoyal, linux-fsdevel,
	nvdimm, linux-kernel, linux-xfs

On 3/22/2022 10:45 PM, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
>>> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
>>> in the subsequent patches.  Did I miss something?
>>
>> dax_iomap_iter() uses the flag in the same patch,
>> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
>> +                       flags |= DAX_RECOVERY;
>> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +                                               flags, &kaddr, NULL);
> 
> Yes, it passes it on to dax_direct_access, and dax_direct_access passes
> it onto ->direct_access.  But nothing in this series actually checks
> for it as far as I can tell.
> 
>>>> Also introduce a new dev_pagemap_ops .recovery_write function.
>>>> The function is applicable to FSDAX device only. The device
>>>> page backend driver provides .recovery_write function if the
>>>> device has underlying mechanism to clear the uncorrectable
>>>> errors on the fly.
>>>
>>> Why is this not in struct dax_operations?
>>
>> Per Dan's comments to the v5 series, adding .recovery_write to
>> dax_operations causes a number of trivial dm targets changes.
>> Dan suggested that adding .recovery_write to pagemap_ops could
>> cut short the logistics of figuring out whether the driver backing
>> up a page is indeed capable of clearing poison. Please see
>> https://lkml.org/lkml/2022/2/4/31
> 
> But at least in this series there is  1:1 association between the
> pgmap and the dax_device so that scheme won't work.   It would
> have to lookup the pgmap based on the return physical address from
> dax_direct_access.  Which sounds more complicated than just adding
> the (annoying) boilerplate code to DM.
> 

Getting dax_direct_access to return pfn seems straight forward,
what do you think of below change?

--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -195,10 +195,10 @@ int dax_zero_page_range(struct dax_device 
*dax_dev, pgoff_t pgoff,
  }
  EXPORT_SYMBOL_GPL(dax_zero_page_range);

-size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, 
pfn_t pfn,
                 void *addr, size_t bytes, struct iov_iter *iter)
  {
-       struct dev_pagemap *pgmap = dax_dev->pgmap;
+       struct dev_pagemap *pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), 
NULL);

         if (!pgmap || !pgmap->ops->recovery_write)
                 return -EIO;


--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1243,6 +1243,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                 int flags, recov;
                 void *kaddr;
                 long nrpg;
+               pfn_t pfn;

                 if (fatal_signal_pending(current)) {
                         ret = -EINTR;
@@ -1257,7 +1258,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                 if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
                         flags |= DAX_RECOVERY;
                         map_len = dax_direct_access(dax_dev, pgoff, nrpg,
-                                               flags, &kaddr, NULL);
+                                               flags, &kaddr, &pfn);
                         if (map_len > 0)
                                 recov++;
                 }
@@ -1273,7 +1274,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                         map_len = end - pos;

                 if (recov)
-                       xfer = dax_recovery_write(dax_dev, pgoff, kaddr,
+                       xfer = dax_recovery_write(dax_dev, pgoff, pfn, 
kaddr,
                                         map_len, iter);
                 else if (iov_iter_rw(iter) == WRITE)
                         xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,


thanks!
-jane

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [dm-devel] [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops
@ 2022-03-26  6:31           ` Jane Chu
  0 siblings, 0 replies; 52+ messages in thread
From: Jane Chu @ 2022-03-26  6:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: nvdimm, dave.jiang, snitzer, djwong, david, linux-kernel, willy,
	linux-xfs, dm-devel, vgoyal, vishal.l.verma, linux-fsdevel,
	dan.j.williams, ira.weiny, agk

On 3/22/2022 10:45 PM, Christoph Hellwig wrote:
> On Tue, Mar 22, 2022 at 11:05:09PM +0000, Jane Chu wrote:
>>> This DAX_RECOVERY doesn't actually seem to be used anywhere here or
>>> in the subsequent patches.  Did I miss something?
>>
>> dax_iomap_iter() uses the flag in the same patch,
>> +               if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
>> +                       flags |= DAX_RECOVERY;
>> +                       map_len = dax_direct_access(dax_dev, pgoff, nrpg,
>> +                                               flags, &kaddr, NULL);
> 
> Yes, it passes it on to dax_direct_access, and dax_direct_access passes
> it onto ->direct_access.  But nothing in this series actually checks
> for it as far as I can tell.
> 
>>>> Also introduce a new dev_pagemap_ops .recovery_write function.
>>>> The function is applicable to FSDAX device only. The device
>>>> page backend driver provides .recovery_write function if the
>>>> device has underlying mechanism to clear the uncorrectable
>>>> errors on the fly.
>>>
>>> Why is this not in struct dax_operations?
>>
>> Per Dan's comments to the v5 series, adding .recovery_write to
>> dax_operations causes a number of trivial dm targets changes.
>> Dan suggested that adding .recovery_write to pagemap_ops could
>> cut short the logistics of figuring out whether the driver backing
>> up a page is indeed capable of clearing poison. Please see
>> https://lkml.org/lkml/2022/2/4/31
> 
> But at least in this series there is  1:1 association between the
> pgmap and the dax_device so that scheme won't work.   It would
> have to lookup the pgmap based on the return physical address from
> dax_direct_access.  Which sounds more complicated than just adding
> the (annoying) boilerplate code to DM.
> 

Getting dax_direct_access to return pfn seems straight forward,
what do you think of below change?

--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -195,10 +195,10 @@ int dax_zero_page_range(struct dax_device 
*dax_dev, pgoff_t pgoff,
  }
  EXPORT_SYMBOL_GPL(dax_zero_page_range);

-size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff,
+size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, 
pfn_t pfn,
                 void *addr, size_t bytes, struct iov_iter *iter)
  {
-       struct dev_pagemap *pgmap = dax_dev->pgmap;
+       struct dev_pagemap *pgmap = get_dev_pagemap(pfn_t_to_pfn(pfn), 
NULL);

         if (!pgmap || !pgmap->ops->recovery_write)
                 return -EIO;


--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1243,6 +1243,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                 int flags, recov;
                 void *kaddr;
                 long nrpg;
+               pfn_t pfn;

                 if (fatal_signal_pending(current)) {
                         ret = -EINTR;
@@ -1257,7 +1258,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                 if ((map_len == -EIO) && (iov_iter_rw(iter) == WRITE)) {
                         flags |= DAX_RECOVERY;
                         map_len = dax_direct_access(dax_dev, pgoff, nrpg,
-                                               flags, &kaddr, NULL);
+                                               flags, &kaddr, &pfn);
                         if (map_len > 0)
                                 recov++;
                 }
@@ -1273,7 +1274,7 @@ static loff_t dax_iomap_iter(const struct 
iomap_iter *iomi,
                         map_len = end - pos;

                 if (recov)
-                       xfer = dax_recovery_write(dax_dev, pgoff, kaddr,
+                       xfer = dax_recovery_write(dax_dev, pgoff, pfn, 
kaddr,
                                         map_len, iter);
                 else if (iov_iter_rw(iter) == WRITE)
                         xfer = dax_copy_from_iter(dax_dev, pgoff, kaddr,


thanks!
-jane
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2022-03-26  6:32 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-19  6:28 [PATCH v6 0/6] DAX poison recovery Jane Chu
2022-03-19  6:28 ` [dm-devel] " Jane Chu
2022-03-19  6:28 ` [PATCH v6 1/6] x86/mm: fix comment Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu
2022-03-22  8:40   ` Christoph Hellwig
2022-03-22  8:40     ` [dm-devel] " Christoph Hellwig
2022-03-19  6:28 ` [PATCH v6 2/6] x86/mce: relocate set{clear}_mce_nospec() functions Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu
2022-03-19  8:13   ` kernel test robot
2022-03-19  8:13     ` [dm-devel] " kernel test robot
2022-03-19  8:24   ` kernel test robot
2022-03-19  8:24     ` [dm-devel] " kernel test robot
2022-03-22  8:42   ` Christoph Hellwig
2022-03-22  8:42     ` [dm-devel] " Christoph Hellwig
2022-03-22 22:19     ` Jane Chu
2022-03-22 22:19       ` [dm-devel] " Jane Chu
2022-03-22 22:41   ` Borislav Petkov
2022-03-22 22:41     ` [dm-devel] " Borislav Petkov
2022-03-22 23:48     ` Jane Chu
2022-03-22 23:48       ` [dm-devel] " Jane Chu
2022-03-19  6:28 ` [PATCH v6 3/6] mce: fix set_mce_nospec to always unmap the whole page Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu
2022-03-22  8:44   ` Christoph Hellwig
2022-03-22  8:44     ` [dm-devel] " Christoph Hellwig
2022-03-22 22:45     ` Jane Chu
2022-03-22 22:45       ` [dm-devel] " Jane Chu
2022-03-19  6:28 ` [PATCH v6 4/6] dax: add DAX_RECOVERY flag and .recovery_write dev_pgmap_ops Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu
2022-03-19  8:24   ` kernel test robot
2022-03-19  8:24     ` [dm-devel] " kernel test robot
2022-03-19  8:44   ` kernel test robot
2022-03-19  8:44     ` [dm-devel] " kernel test robot
2022-03-22  9:01   ` Christoph Hellwig
2022-03-22  9:01     ` [dm-devel] " Christoph Hellwig
2022-03-22 23:05     ` Jane Chu
2022-03-22 23:05       ` [dm-devel] " Jane Chu
2022-03-23  5:45       ` Christoph Hellwig
2022-03-23  5:45         ` [dm-devel] " Christoph Hellwig
2022-03-23 18:43         ` Jane Chu
2022-03-23 18:43           ` [dm-devel] " Jane Chu
2022-03-24  6:37           ` Christoph Hellwig
2022-03-24  6:37             ` [dm-devel] " Christoph Hellwig
2022-03-26  6:31         ` Jane Chu
2022-03-26  6:31           ` [dm-devel] " Jane Chu
2022-03-19  6:28 ` [PATCH v6 5/6] pmem: refactor pmem_clear_poison() Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu
2022-03-22  8:53   ` Christoph Hellwig
2022-03-22  8:53     ` [dm-devel] " Christoph Hellwig
2022-03-22 23:45     ` Jane Chu
2022-03-22 23:45       ` [dm-devel] " Jane Chu
2022-03-19  6:28 ` [PATCH v6 6/6] pmem: implement pmem_recovery_write() Jane Chu
2022-03-19  6:28   ` [dm-devel] " Jane Chu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.