* [PATCH v2 0/3] misc updates for Address Range Scrub
@ 2016-09-30 20:10 ` Vishal Verma
  0 siblings, 0 replies; 14+ messages in thread
From: Vishal Verma @ 2016-09-30 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Rafael J. Wysocki, linux-acpi

Changes in v2:
- Change the 'scrub' attribute to only show the number of completed scrubs,
  and start a new one on demand. (Dan)
- Add a new attribute 'hw_error_scrub' which controls whether a full
  scrub will run on hardware memory errors. (Dan)

Patch 1 changes the default behaviour on machine check exceptions to
just adding the error address to badblocks accounting instead of starting
a full ARS. The old behaviour (a full scrub) can be re-enabled via sysfs.

Patches 2 and 3 fix a problem where stale badblocks could show up after an
on-demand ARS or an MCE-triggered scrub, because when clearing poison we
didn't also clear the internal nvdimm_bus->poison_list.

Vishal Verma (3):
  nfit: don't start a full scrub by default for an MCE
  pmem: reduce kmap_atomic sections to the memcpys only
  libnvdimm: clear the internal poison_list when clearing badblocks

 drivers/acpi/nfit/core.c  | 54 +++++++++++++++++++++++++++++++++++
 drivers/acpi/nfit/mce.c   | 24 ++++++++++++----
 drivers/acpi/nfit/nfit.h  |  6 ++++
 drivers/nvdimm/bus.c      |  2 ++
 drivers/nvdimm/core.c     | 73 ++++++++++++++++++++++++++++++++++++++++++++---
 drivers/nvdimm/pmem.c     | 28 ++++++++++++++----
 include/linux/libnvdimm.h |  2 ++
 7 files changed, 175 insertions(+), 14 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v2 1/3] nfit: don't start a full scrub by default for an MCE
  2016-09-30 20:10 ` Vishal Verma
@ 2016-09-30 20:10   ` Vishal Verma
  -1 siblings, 0 replies; 14+ messages in thread
From: Vishal Verma @ 2016-09-30 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Rafael J. Wysocki, linux-acpi

Starting a full Address Range Scrub (ARS) on hitting a memory error
machine check exception may not always be desirable. Provide a way
through sysfs to toggle the behavior between just adding the address
(cache line) where the MCE happened to the poison list and doing a full
scrub. The former (selective insertion of the address) is done
unconditionally.
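
As a rough userspace sketch of how the toggle is meant to be used (the
sysfs path below is only a guess at where the bus-level NFIT attribute
group ends up; substitute the actual ndbusX instance on the system):

  #include <stdio.h>

  int main(void)
  {
          /* hypothetical path; depends on the registered nvdimm bus */
          FILE *f = fopen("/sys/bus/nd/devices/ndbus0/nfit/hw_error_scrub", "w");

          if (!f)
                  return 1;
          /* '1': full ARS on a memory error, '2': badblocks-only (the default) */
          fputs("1\n", f);
          return fclose(f) ? 1 : 0;
  }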

Cc: linux-acpi@vger.kernel.org
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 drivers/acpi/nfit/core.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/acpi/nfit/mce.c  | 24 ++++++++++++++++-----
 drivers/acpi/nfit/nfit.h |  6 ++++++
 3 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 80cc7c0..6a12bc2 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -878,6 +878,58 @@ static ssize_t revision_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(revision);
 
+static ssize_t hw_error_scrub_show(struct device *dev,
+		struct device_attribute *attr, char *buf)
+{
+	struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
+	struct nvdimm_bus_descriptor *nd_desc = to_nd_desc(nvdimm_bus);
+	struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
+
+	return sprintf(buf, "%d\n", acpi_desc->scrub_mode);
+}
+
+/*
+ * The 'hw_error_scrub' attribute can have the following values written to it:
+ * '1': Enable a full scrub to happen if an exception for a memory error is
+ *      received.
+ * '2': Switch to the default mode where an exception will only insert
+ *      the address of the memory error into the poison and badblocks lists.
+ */
+static ssize_t hw_error_scrub_store(struct device *dev,
+		struct device_attribute *attr, const char *buf, size_t size)
+{
+	struct nvdimm_bus_descriptor *nd_desc;
+	ssize_t rc;
+	long val;
+
+	rc = kstrtol(buf, 0, &val);
+	if (rc)
+		return rc;
+
+	device_lock(dev);
+	nd_desc = dev_get_drvdata(dev);
+	if (nd_desc) {
+		struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
+
+		switch (val) {
+		case MCE_SCRUB_ON:
+			acpi_desc->scrub_mode = MCE_SCRUB_ON;
+			break;
+		case MCE_SCRUB_OFF:
+			acpi_desc->scrub_mode = MCE_SCRUB_OFF;
+			break;
+		default:
+			rc = -EINVAL;
+			break;
+		}
+	}
+	device_unlock(dev);
+	if (rc)
+		return rc;
+	return size;
+}
+static DEVICE_ATTR_RW(hw_error_scrub);
+
 /*
  * This shows the number of full Address Range Scrubs that have been
  * completed since driver load time. Userspace can wait on this using
@@ -950,6 +1002,7 @@ static umode_t nfit_visible(struct kobject *kobj, struct attribute *a, int n)
 static struct attribute *acpi_nfit_attributes[] = {
 	&dev_attr_revision.attr,
 	&dev_attr_scrub.attr,
+	&dev_attr_hw_error_scrub.attr,
 	NULL,
 };
 
@@ -2489,6 +2542,7 @@ int acpi_nfit_init(struct acpi_nfit_desc *acpi_desc, void *data, acpi_size sz)
 		goto out_unlock;
 	}
 
+	acpi_desc->scrub_mode = MCE_SCRUB_OFF;
 	rc = acpi_nfit_check_deletions(acpi_desc, &prev);
 	if (rc)
 		goto out_unlock;
diff --git a/drivers/acpi/nfit/mce.c b/drivers/acpi/nfit/mce.c
index 161f915..05005e2 100644
--- a/drivers/acpi/nfit/mce.c
+++ b/drivers/acpi/nfit/mce.c
@@ -14,6 +14,7 @@
  */
 #include <linux/notifier.h>
 #include <linux/acpi.h>
+#include <linux/nd.h>
 #include <asm/mce.h>
 #include "nfit.h"
 
@@ -62,12 +63,25 @@ static int nfit_handle_mce(struct notifier_block *nb, unsigned long val,
 		}
 		mutex_unlock(&acpi_desc->init_mutex);
 
-		/*
-		 * We can ignore an -EBUSY here because if an ARS is already
-		 * in progress, just let that be the last authoritative one
-		 */
-		if (found_match)
+		if (!found_match)
+			continue;
+
+		/* If this fails due to an -ENOMEM, there is little we can do */
+		nvdimm_bus_add_poison(acpi_desc->nvdimm_bus,
+				ALIGN(mce->addr, L1_CACHE_BYTES),
+				L1_CACHE_BYTES);
+		nvdimm_region_notify(nfit_spa->nd_region,
+				NVDIMM_REVALIDATE_POISON);
+
+		if (acpi_desc->scrub_mode == MCE_SCRUB_ON) {
+			/*
+			 * We can ignore an -EBUSY here because if an ARS is
+			 * already in progress, just let that be the last
+			 * authoritative one
+			 */
 			acpi_nfit_ars_rescan(acpi_desc);
+		}
+		break;
 	}
 
 	mutex_unlock(&acpi_desc_lock);
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index e894ded..487dc0a 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -152,6 +152,7 @@ struct acpi_nfit_desc {
 	struct list_head list;
 	struct kernfs_node *scrub_count_state;
 	unsigned int scrub_count;
+	unsigned int scrub_mode;
 	unsigned int cancel:1;
 	unsigned long dimm_cmd_force_en;
 	unsigned long bus_cmd_force_en;
@@ -159,6 +160,11 @@ struct acpi_nfit_desc {
 			void *iobuf, u64 len, int rw);
 };
 
+enum scrub_mode {
+	MCE_SCRUB_ON = 1,
+	MCE_SCRUB_OFF = 2,
+};
+
 enum nd_blk_mmio_selector {
 	BDW,
 	DCR,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 2/3] pmem: reduce kmap_atomic sections to the memcpys only
@ 2016-09-30 20:10   ` Vishal Verma
  0 siblings, 0 replies; 14+ messages in thread
From: Vishal Verma @ 2016-09-30 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Rafael J. Wysocki, linux-acpi

pmem_do_bvec used to kmap_atomic the page at the beginning, and only
unmap it at the end. Things like nvdimm_clear_poison may want to do
nvdimm subsystem bookkeeping operations that may involve taking locks or
doing memory allocations, and we can't do that from atomic context.
Reduce the atomic context to just what needs it: the memcpy to/from pmem.

Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 drivers/nvdimm/pmem.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 571a6c7..ca038d8 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -66,13 +66,32 @@ static void pmem_clear_poison(struct pmem_device *pmem, phys_addr_t offset,
 	invalidate_pmem(pmem->virt_addr + offset, len);
 }
 
+static void map_and_copy_to_pmem(void *pmem_addr, struct page *page,
+		unsigned int off, unsigned int len)
+{
+	void *mem = kmap_atomic(page);
+
+	memcpy_to_pmem(pmem_addr, mem + off, len);
+	kunmap_atomic(mem);
+}
+
+static int map_and_copy_from_pmem(struct page *page, unsigned int off,
+		void *pmem_addr, unsigned int len)
+{
+	int rc;
+	void *mem = kmap_atomic(page);
+
+	rc = memcpy_from_pmem(mem + off, pmem_addr, len);
+	kunmap_atomic(mem);
+	return rc;
+}
+
 static int pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 			unsigned int len, unsigned int off, bool is_write,
 			sector_t sector)
 {
 	int rc = 0;
 	bool bad_pmem = false;
-	void *mem = kmap_atomic(page);
 	phys_addr_t pmem_off = sector * 512 + pmem->data_offset;
 	void *pmem_addr = pmem->virt_addr + pmem_off;
 
@@ -83,7 +102,7 @@ static int pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 		if (unlikely(bad_pmem))
 			rc = -EIO;
 		else {
-			rc = memcpy_from_pmem(mem + off, pmem_addr, len);
+			rc = map_and_copy_from_pmem(page, off, pmem_addr, len);
 			flush_dcache_page(page);
 		}
 	} else {
@@ -102,14 +121,13 @@ static int pmem_do_bvec(struct pmem_device *pmem, struct page *page,
 		 * after clear poison.
 		 */
 		flush_dcache_page(page);
-		memcpy_to_pmem(pmem_addr, mem + off, len);
+		map_and_copy_to_pmem(pmem_addr, page, off, len);
 		if (unlikely(bad_pmem)) {
 			pmem_clear_poison(pmem, pmem_off, len);
-			memcpy_to_pmem(pmem_addr, mem + off, len);
+			map_and_copy_to_pmem(pmem_addr, page, off, len);
 		}
 	}
 
-	kunmap_atomic(mem);
 	return rc;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v2 3/3] libnvdimm: clear the internal poison_list when clearing badblocks
  2016-09-30 20:10 ` Vishal Verma
@ 2016-09-30 20:10   ` Vishal Verma
  -1 siblings, 0 replies; 14+ messages in thread
From: Vishal Verma @ 2016-09-30 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Rafael J. Wysocki, linux-acpi

nvdimm_clear_poison cleared the user-visible badblocks, and sent
commands to the NVDIMM to clear the areas marked as 'poison', but it
neglected to clear the same areas from the internal poison_list which is
used to marshal ARS results before sorting them by namespace. As a
result, once on-demand ARS functionality was added:

37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand

A scrub triggered from either sysfs or an MCE was found to be adding
stale entries that had been cleared from gendisk->badblocks, but were
still present in nvdimm_bus->poison_list.

This adds the missing step of clearing poison_list entries when clearing
poison, so that it is in sync with badblocks.
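
For reference, the clearing helper added below has to handle four ways
the cleared range [start, clr_end] can overlap an existing poison_list
entry [pl->start, pl_end]:

 - the entry lies entirely inside the cleared range: the entry is deleted
 - the cleared range covers the front of the entry: pl->start moves to
   clr_end + 1 and pl->length shrinks accordingly
 - the cleared range covers the tail of the entry: pl->length becomes
   start - pl->start
 - the cleared range sits in the middle of the entry: the entry is split,
   keeping [pl->start, start - 1] and adding a new [clr_end + 1, pl_end]

For example (addresses made up purely for illustration), clearing
0x2000-0x2fff from an entry covering 0x1000-0x3fff leaves two entries,
0x1000-0x1fff and 0x3000-0x3fff.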

Fixes: 37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
---
 drivers/nvdimm/bus.c      |  2 ++
 drivers/nvdimm/core.c     | 73 ++++++++++++++++++++++++++++++++++++++++++++---
 include/linux/libnvdimm.h |  2 ++
 3 files changed, 73 insertions(+), 4 deletions(-)

diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 935866f..a8b6949 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -217,6 +217,8 @@ long nvdimm_clear_poison(struct device *dev, phys_addr_t phys,
 		return rc;
 	if (cmd_rc < 0)
 		return cmd_rc;
+
+	nvdimm_clear_from_poison_list(nvdimm_bus, phys, len);
 	return clear_err.cleared;
 }
 EXPORT_SYMBOL_GPL(nvdimm_clear_poison);
diff --git a/drivers/nvdimm/core.c b/drivers/nvdimm/core.c
index 715583f..42e40db 100644
--- a/drivers/nvdimm/core.c
+++ b/drivers/nvdimm/core.c
@@ -541,11 +541,12 @@ void nvdimm_badblocks_populate(struct nd_region *nd_region,
 }
 EXPORT_SYMBOL_GPL(nvdimm_badblocks_populate);
 
-static int add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length)
+static int add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length,
+			gfp_t flags)
 {
 	struct nd_poison *pl;
 
-	pl = kzalloc(sizeof(*pl), GFP_KERNEL);
+	pl = kzalloc(sizeof(*pl), flags);
 	if (!pl)
 		return -ENOMEM;
 
@@ -561,7 +562,7 @@ static int bus_add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length)
 	struct nd_poison *pl;
 
 	if (list_empty(&nvdimm_bus->poison_list))
-		return add_poison(nvdimm_bus, addr, length);
+		return add_poison(nvdimm_bus, addr, length, GFP_KERNEL);
 
 	/*
 	 * There is a chance this is a duplicate, check for those first.
@@ -581,7 +582,7 @@ static int bus_add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length)
 	 * as any overlapping ranges will get resolved when the list is consumed
 	 * and converted to badblocks
 	 */
-	return add_poison(nvdimm_bus, addr, length);
+	return add_poison(nvdimm_bus, addr, length, GFP_KERNEL);
 }
 
 int nvdimm_bus_add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length)
@@ -596,6 +597,70 @@ int nvdimm_bus_add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length)
 }
 EXPORT_SYMBOL_GPL(nvdimm_bus_add_poison);
 
+void nvdimm_clear_from_poison_list(struct nvdimm_bus *nvdimm_bus,
+		phys_addr_t start, unsigned int len)
+{
+	struct list_head *poison_list = &nvdimm_bus->poison_list;
+	u64 clr_end = start + len - 1;
+	struct nd_poison *pl, *next;
+
+	nvdimm_bus_lock(&nvdimm_bus->dev);
+	WARN_ON_ONCE(list_empty(poison_list));
+
+	/*
+	 * [start, clr_end] is the poison interval being cleared.
+	 * [pl->start, pl_end] is the poison_list entry we're comparing
+	 * the above interval against. The poison list entry may need
+	 * to be modified (update either start or length), deleted, or
+	 * split into two based on the overlap characteristics
+	 */
+
+	list_for_each_entry_safe(pl, next, poison_list, list) {
+		u64 pl_end = pl->start + pl->length - 1;
+
+		/* Skip intervals with no intersection */
+		if (pl_end < start)
+			continue;
+		if (pl->start >  clr_end)
+			continue;
+		/* Delete completely overlapped poison entries */
+		if ((pl->start >= start) && (pl_end <= clr_end)) {
+			list_del(&pl->list);
+			kfree(pl);
+			continue;
+		}
+		/* Adjust start point of partially cleared entries */
+		if ((start <= pl->start) && (clr_end > pl->start)) {
+			pl->length -= clr_end - pl->start + 1;
+			pl->start = clr_end + 1;
+			continue;
+		}
+		/* Adjust pl->length for partial clearing at the tail end */
+		if ((pl->start < start) && (pl_end <= clr_end)) {
+			/* pl->start remains the same */
+			pl->length = start - pl->start;
+			continue;
+		}
+		/*
+		 * If clearing in the middle of an entry, we split it into
+		 * two by modifying the current entry to represent one half of
+		 * the split, and adding a new entry for the second half.
+		 */
+		if ((pl->start < start) && (pl_end > clr_end)) {
+			u64 new_start = clr_end + 1;
+			u64 new_len = pl_end - new_start + 1;
+
+			/* Add new entry covering the right half */
+			add_poison(nvdimm_bus, new_start, new_len, GFP_NOIO);
+			/* Adjust this entry to cover the left half */
+			pl->length = start - pl->start;
+			continue;
+		}
+	}
+	nvdimm_bus_unlock(&nvdimm_bus->dev);
+}
+EXPORT_SYMBOL_GPL(nvdimm_clear_from_poison_list);
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 int nd_integrity_init(struct gendisk *disk, unsigned long meta_size)
 {
diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
index b519e13..bbfce62 100644
--- a/include/linux/libnvdimm.h
+++ b/include/linux/libnvdimm.h
@@ -129,6 +129,8 @@ static inline struct nd_blk_region_desc *to_blk_region_desc(
 }
 
 int nvdimm_bus_add_poison(struct nvdimm_bus *nvdimm_bus, u64 addr, u64 length);
+void nvdimm_clear_from_poison_list(struct nvdimm_bus *nvdimm_bus,
+		phys_addr_t start, unsigned int len);
 struct nvdimm_bus *nvdimm_bus_register(struct device *parent,
 		struct nvdimm_bus_descriptor *nfit_desc);
 void nvdimm_bus_unregister(struct nvdimm_bus *nvdimm_bus);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 1/3] nfit: don't start a full scrub by default for an MCE
@ 2016-09-30 22:39     ` Dan Williams
  0 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2016-09-30 22:39 UTC (permalink / raw)
  To: Vishal Verma; +Cc: Linux ACPI, Rafael J. Wysocki, linux-nvdimm

On Fri, Sep 30, 2016 at 1:10 PM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> Starting a full Address Range Scrub (ARS) on hitting a memory error
> machine check exception may not always be desirable. Provide a way
> through sysfs to toggle the behavior between just adding the address
> (cache line) where the MCE happened to the poison list and doing a full
> scrub. The former (selective insertion of the address) is done
> unconditionally.
>
> Cc: linux-acpi@vger.kernel.org
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Linda Knippers <linda.knippers@hpe.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
>  drivers/acpi/nfit/core.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/nfit/mce.c  | 24 ++++++++++++++++-----
>  drivers/acpi/nfit/nfit.h |  6 ++++++
>  3 files changed, 79 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> index 80cc7c0..6a12bc2 100644
> --- a/drivers/acpi/nfit/core.c
> +++ b/drivers/acpi/nfit/core.c
> @@ -878,6 +878,58 @@ static ssize_t revision_show(struct device *dev,
>  }
>  static DEVICE_ATTR_RO(revision);
>
> +static ssize_t hw_error_scrub_show(struct device *dev,
> +               struct device_attribute *attr, char *buf)
> +{
> +       struct nvdimm_bus *nvdimm_bus = to_nvdimm_bus(dev);
> +       struct nvdimm_bus_descriptor *nd_desc = to_nd_desc(nvdimm_bus);
> +       struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
> +
> +       return sprintf(buf, "%d\n", acpi_desc->scrub_mode);
> +}
> +
> +/*
> + * The 'hw_error_scrub' attribute can have the following values written to it:
> + * '1': Enable a full scrub to happen if an exception for a memory error is
> + *      received.
> + * '2': Switch to the default mode where an exception will only insert
> + *      the address of the memory error into the poison and badblocks lists.

Should the default be zero instead of 2, to match what ->scrub_mode is
initialized to by default?

> + */
> +static ssize_t hw_error_scrub_store(struct device *dev,
> +               struct device_attribute *attr, const char *buf, size_t size)
> +{
> +       struct nvdimm_bus_descriptor *nd_desc;
> +       ssize_t rc;
> +       long val;
> +
> +       rc = kstrtol(buf, 0, &val);
> +       if (rc)
> +               return rc;
> +
> +       device_lock(dev);
> +       nd_desc = dev_get_drvdata(dev);
> +       if (nd_desc) {
> +               struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
> +
> +               switch (val) {
> +               case MCE_SCRUB_ON:
> +                       acpi_desc->scrub_mode = MCE_SCRUB_ON;

If we're calling the sysfs attribute "hw_error_scrub" to avoid the
x86-specific MCE naming, I think we should call these defines
HW_ERROR_SCRUB_*.

> +                       break;
> +               case MCE_SCRUB_OFF:
> +                       acpi_desc->scrub_mode = MCE_SCRUB_OFF;
> +                       break;
> +               default:
> +                       rc = -EINVAL;
> +                       break;
> +               }
> +       }
> +       device_unlock(dev);
> +       if (rc)
> +               return rc;
> +       return size;
> +}
> +static DEVICE_ATTR_RW(hw_error_scrub);
> +
>  /*
>   * This shows the number of full Address Range Scrubs that have been
>   * completed since driver load time. Userspace can wait on this using
> @@ -950,6 +1002,7 @@ static umode_t nfit_visible(struct kobject *kobj, struct attribute *a, int n)
>  static struct attribute *acpi_nfit_attributes[] = {
>         &dev_attr_revision.attr,
>         &dev_attr_scrub.attr,
> +       &dev_attr_hw_error_scrub.attr,
>         NULL,
>  };
>
> @@ -2489,6 +2542,7 @@ int acpi_nfit_init(struct acpi_nfit_desc *acpi_desc, void *data, acpi_size sz)
>                 goto out_unlock;
>         }
>
> +       acpi_desc->scrub_mode = MCE_SCRUB_OFF;

If we make HW_ERROR_SCRUB_OFF == 0, then we don't need this line.
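
Something along these lines, perhaps (just a sketch of the suggestion,
with a made-up enum name, not what v2 defines):

  enum hw_error_scrub_mode {
          HW_ERROR_SCRUB_OFF = 0, /* default: only record the error in badblocks */
          HW_ERROR_SCRUB_ON = 1,  /* kick off a full ARS on a memory error */
  };

With OFF equal to zero, a freshly allocated (zero-initialized)
acpi_nfit_desc already starts out in the default mode, so the explicit
assignment in acpi_nfit_init() can simply be dropped.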

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 2/3] pmem: reduce kmap_atomic sections to the memcpys only
  2016-09-30 20:10   ` Vishal Verma
@ 2016-09-30 22:40     ` Dan Williams
  -1 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2016-09-30 22:40 UTC (permalink / raw)
  To: Vishal Verma; +Cc: Linux ACPI, Rafael J. Wysocki, linux-nvdimm

On Fri, Sep 30, 2016 at 1:10 PM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> pmem_do_bvec used to kmap_atomic the page at the beginning, and only
> unmap it at the end. Things like nvdimm_clear_poison may want to do
> nvdimm subsystem bookkeeping operations that may involve taking locks or
> doing memory allocations, and we can't do that from atomic context.
> Reduce the atomic context to just what needs it: the memcpy to/from pmem.
>
> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
> ---
>  drivers/nvdimm/pmem.c | 28 +++++++++++++++++++++++-----
>  1 file changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
> index 571a6c7..ca038d8 100644
> --- a/drivers/nvdimm/pmem.c
> +++ b/drivers/nvdimm/pmem.c
> @@ -66,13 +66,32 @@ static void pmem_clear_poison(struct pmem_device *pmem, phys_addr_t offset,
>         invalidate_pmem(pmem->virt_addr + offset, len);
>  }
>
> +static void map_and_copy_to_pmem(void *pmem_addr, struct page *page,
> +               unsigned int off, unsigned int len)
> +{
> +       void *mem = kmap_atomic(page);
> +
> +       memcpy_to_pmem(pmem_addr, mem + off, len);
> +       kunmap_atomic(mem);
> +}
> +
> +static int map_and_copy_from_pmem(struct page *page, unsigned int off,
> +               void *pmem_addr, unsigned int len)
> +{
> +       int rc;
> +       void *mem = kmap_atomic(page);
> +
> +       rc = memcpy_from_pmem(mem + off, pmem_addr, len);
> +       kunmap_atomic(mem);
> +       return rc;
> +}

Let's just name these write_pmem() and read_pmem() respectively.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2 3/3] libnvdimm: clear the internal poison_list when clearing badblocks
  2016-09-30 20:10   ` Vishal Verma
@ 2016-09-30 22:48     ` Dan Williams
  -1 siblings, 0 replies; 14+ messages in thread
From: Dan Williams @ 2016-09-30 22:48 UTC (permalink / raw)
  To: Vishal Verma; +Cc: Linux ACPI, Rafael J. Wysocki, linux-nvdimm

On Fri, Sep 30, 2016 at 1:10 PM, Vishal Verma <vishal.l.verma@intel.com> wrote:
> nvdimm_clear_poison cleared the user-visible badblocks, and sent
> commands to the NVDIMM to clear the areas marked as 'poison', but it
> neglected to clear the same areas from the internal poison_list which is
> used to marshal ARS results before sorting them by namespace. As a
> result, once on-demand ARS functionality was added:
>
> 37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand
>
> A scrub triggered from either sysfs or an MCE was found to be adding
> stale entries that had been cleared from gendisk->badblocks, but were
> still present in nvdimm_bus->poison_list.
>
> This adds the missing step of clearing poison_list entries when clearing
> poison, so that it is in sync with badblocks.
>
> Fixes: 37b137f nfit, libnvdimm: allow an ARS scrub to be triggered on demand
> Cc: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>

Looks good. When you resend the other patches, amend this changelog to
also say that the cleared errors come back if you disable and
re-enable a namespace.

^ permalink raw reply	[flat|nested] 14+ messages in thread
