linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution
@ 2019-02-14 20:10 Dan Williams
  2019-02-14 20:10 ` [PATCH 1/6] nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot Dan Williams
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm
  Cc: Krzysztof Rusocki, Vishal Verma, Toshi Kani, Erwin Tsaur, linux-kernel

Here is a small pile of updates to better coordinate the Linux ARS state
machine with platform-BIOS implementations. Specifically, take advantage
of opportunities to run short-ARS whenever the ARS interface is found to
be idle at init, always run short-ARS even if no_init_ars is specified,
allow root to reset the exponential backoff polling interval for ARS
completion, and protect the kernel against the consumption of stale ARS
results.

---

Dan Williams (6):
      nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
      nfit/ars: Attempt short-ARS even in the no_init_ars case
      nfit/ars: Allow root to busy-poll the ARS state machine
      nfit/ars: Remove ars_start_flags
      nfit/ars: Introduce scrub_flags
      nfit/ars: Avoid stale ARS results


 drivers/acpi/nfit/core.c |   67 ++++++++++++++++++++++++++++++++--------------
 drivers/acpi/nfit/nfit.h |   10 +++++--
 2 files changed, 53 insertions(+), 24 deletions(-)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/6] nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  2019-02-14 20:10 ` [PATCH 2/6] nfit/ars: Attempt short-ARS even in the no_init_ars case Dan Williams
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Krzysztof Rusocki, vishal.l.verma, linux-kernel

If query-ARS reports that ARS has stopped and requires continuation
attempt to retrieve short-ARS results before continuing the long
operation.

Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index e18ade5d74e9..3d681a92ff7f 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -3012,6 +3012,7 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
 
 	switch (acpi_nfit_query_poison(acpi_desc)) {
 	case 0:
+	case -ENOSPC:
 	case -EAGAIN:
 		rc = ars_start(acpi_desc, nfit_spa, ARS_REQ_SHORT);
 		/* shouldn't happen, try again later */
@@ -3036,7 +3037,6 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
 		break;
 	case -EBUSY:
 	case -ENOMEM:
-	case -ENOSPC:
 		/*
 		 * BIOS was using ARS, wait for it to complete (or
 		 * resources to become available) and then perform our


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/6] nfit/ars: Attempt short-ARS even in the no_init_ars case
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
  2019-02-14 20:10 ` [PATCH 1/6] nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  2019-02-14 20:10 ` [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine Dan Williams
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Erwin Tsaur, vishal.l.verma, linux-kernel

The no_init_ars option is meant to prevent long-ARS, but short-ARS
should be allowed to grab any immediate results.

Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 3d681a92ff7f..934be96dc149 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -3004,11 +3004,12 @@ static int ars_register(struct acpi_nfit_desc *acpi_desc,
 {
 	int rc;
 
-	if (no_init_ars || test_bit(ARS_FAILED, &nfit_spa->ars_state))
+	if (test_bit(ARS_FAILED, &nfit_spa->ars_state))
 		return acpi_nfit_register_region(acpi_desc, nfit_spa);
 
 	set_bit(ARS_REQ_SHORT, &nfit_spa->ars_state);
-	set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
+	if (!no_init_ars)
+		set_bit(ARS_REQ_LONG, &nfit_spa->ars_state);
 
 	switch (acpi_nfit_query_poison(acpi_desc)) {
 	case 0:


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
  2019-02-14 20:10 ` [PATCH 1/6] nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot Dan Williams
  2019-02-14 20:10 ` [PATCH 2/6] nfit/ars: Attempt short-ARS even in the no_init_ars case Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  2019-02-14 20:19   ` [PATCH v2] " Dan Williams
  2019-02-14 20:27   ` [PATCH 3/6] " Dan Williams
  2019-02-14 20:10 ` [PATCH 4/6] nfit/ars: Remove ars_start_flags Dan Williams
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Erwin Tsaur, vishal.l.verma, linux-kernel

The ARS implementation implements exponential back-off on the poll
interval to prevent high-frequency access to the DIMM / platform
interface. Depending on when the ARS completes the poll interval may
exceed the completion event by minutes. Allow root to reset the timeout
each time it probes the status. A one-second timeout is still enforced,
but root can otherwise can control the poll interval.

Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 934be96dc149..5ec0e19760ea 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1328,6 +1328,12 @@ static ssize_t scrub_show(struct device *dev,
 		rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
 				acpi_desc->scrub_busy
 				&& !acpi_desc->cancel ? "+\n" : "\n");
+		/* Allow an admin to poll the busy state at a higher rate */
+		if (acpi_desc->scrub_busy && !acpi_desc->cancel
+				&& capable(CAP_SYS_RAWIO)) {
+			acpi_desc->scrub_tmo = 1;
+			queue_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
+		}
 		mutex_unlock(&acpi_desc->init_mutex);
 	}
 	device_unlock(dev);


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/6] nfit/ars: Remove ars_start_flags
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
                   ` (2 preceding siblings ...)
  2019-02-14 20:10 ` [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  2019-02-14 20:10 ` [PATCH 5/6] nfit/ars: Introduce scrub_flags Dan Williams
  2019-02-14 20:10 ` [PATCH 6/6] nfit/ars: Avoid stale ARS results Dan Williams
  5 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Toshi Kani, vishal.l.verma, linux-kernel

The ars_start_flags property of 'struct acpi_nfit_desc' is no longer
used since ARS_REQ_SHORT and ARS_REQ_LONG were added.

Cc: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |   10 +++++-----
 drivers/acpi/nfit/nfit.h |    1 -
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 5ec0e19760ea..7086c193aa81 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2657,11 +2657,11 @@ static int ars_continue(struct acpi_nfit_desc *acpi_desc)
 	struct nvdimm_bus_descriptor *nd_desc = &acpi_desc->nd_desc;
 	struct nd_cmd_ars_status *ars_status = acpi_desc->ars_status;
 
-	memset(&ars_start, 0, sizeof(ars_start));
-	ars_start.address = ars_status->restart_address;
-	ars_start.length = ars_status->restart_length;
-	ars_start.type = ars_status->type;
-	ars_start.flags = acpi_desc->ars_start_flags;
+	ars_start = (struct nd_cmd_ars_start) {
+		.address = ars_status->restart_address,
+		.length = ars_status->restart_length,
+		.type = ars_status->type,
+	};
 	rc = nd_desc->ndctl(nd_desc, NULL, ND_CMD_ARS_START, &ars_start,
 			sizeof(ars_start), &cmd_rc);
 	if (rc < 0)
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 33691aecfcee..871fb3de3b30 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -223,7 +223,6 @@ struct acpi_nfit_desc {
 	struct list_head idts;
 	struct nvdimm_bus *nvdimm_bus;
 	struct device *dev;
-	u8 ars_start_flags;
 	struct nd_cmd_ars_status *ars_status;
 	struct nfit_spa *scrub_spa;
 	struct delayed_work dwork;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/6] nfit/ars: Introduce scrub_flags
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
                   ` (3 preceding siblings ...)
  2019-02-14 20:10 ` [PATCH 4/6] nfit/ars: Remove ars_start_flags Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  2019-02-14 20:10 ` [PATCH 6/6] nfit/ars: Avoid stale ARS results Dan Williams
  5 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: vishal.l.verma, linux-kernel

In preparation for introducing a new flag to gate whether ARS results
are stale, convert the existing flags to an unsigned long with
enumerated values. This conversion allows the flags to be atomically
updated outside of ->init_mutex.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |   39 +++++++++++++++++++++------------------
 drivers/acpi/nfit/nfit.h |    8 ++++++--
 2 files changed, 27 insertions(+), 20 deletions(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 7086c193aa81..612cf5d92e25 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1317,25 +1317,28 @@ static ssize_t scrub_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
 {
 	struct nvdimm_bus_descriptor *nd_desc;
+	struct acpi_nfit_desc *acpi_desc;
 	ssize_t rc = -ENXIO;
+	bool busy;
 
 	device_lock(dev);
 	nd_desc = dev_get_drvdata(dev);
-	if (nd_desc) {
-		struct acpi_nfit_desc *acpi_desc = to_acpi_desc(nd_desc);
+	if (!nd_desc) {
+		device_unlock(dev);
+		return rc;
+	}
+	acpi_desc = to_acpi_desc(nd_desc);
 
-		mutex_lock(&acpi_desc->init_mutex);
-		rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
-				acpi_desc->scrub_busy
-				&& !acpi_desc->cancel ? "+\n" : "\n");
-		/* Allow an admin to poll the busy state at a higher rate */
-		if (acpi_desc->scrub_busy && !acpi_desc->cancel
-				&& capable(CAP_SYS_RAWIO)) {
-			acpi_desc->scrub_tmo = 1;
-			queue_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
-		}
-		mutex_unlock(&acpi_desc->init_mutex);
+	mutex_lock(&acpi_desc->init_mutex);
+	busy = test_bit(ARS_BUSY, &acpi_desc->scrub_flags)
+		&& !test_bit(ARS_CANCEL, &acpi_desc->scrub_flags);
+	rc = sprintf(buf, "%d%s", acpi_desc->scrub_count, busy ? "+\n" : "\n");
+	/* Allow an admin to poll the busy state at a higher rate */
+	if (busy && capable(CAP_SYS_RAWIO)) {
+		acpi_desc->scrub_tmo = 1;
+		queue_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
 	}
+	mutex_unlock(&acpi_desc->init_mutex);
 	device_unlock(dev);
 	return rc;
 }
@@ -3078,7 +3081,7 @@ static unsigned int __acpi_nfit_scrub(struct acpi_nfit_desc *acpi_desc,
 
 	lockdep_assert_held(&acpi_desc->init_mutex);
 
-	if (acpi_desc->cancel)
+	if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags))
 		return 0;
 
 	if (query_rc == -EBUSY) {
@@ -3152,7 +3155,7 @@ static void __sched_ars(struct acpi_nfit_desc *acpi_desc, unsigned int tmo)
 {
 	lockdep_assert_held(&acpi_desc->init_mutex);
 
-	acpi_desc->scrub_busy = 1;
+	set_bit(ARS_BUSY, &acpi_desc->scrub_flags);
 	/* note this should only be set from within the workqueue */
 	if (tmo)
 		acpi_desc->scrub_tmo = tmo;
@@ -3168,7 +3171,7 @@ static void notify_ars_done(struct acpi_nfit_desc *acpi_desc)
 {
 	lockdep_assert_held(&acpi_desc->init_mutex);
 
-	acpi_desc->scrub_busy = 0;
+	clear_bit(ARS_BUSY, &acpi_desc->scrub_flags);
 	acpi_desc->scrub_count++;
 	if (acpi_desc->scrub_count_state)
 		sysfs_notify_dirent(acpi_desc->scrub_count_state);
@@ -3457,7 +3460,7 @@ int acpi_nfit_ars_rescan(struct acpi_nfit_desc *acpi_desc,
 	struct nfit_spa *nfit_spa;
 
 	mutex_lock(&acpi_desc->init_mutex);
-	if (acpi_desc->cancel) {
+	if (test_bit(ARS_CANCEL, &acpi_desc->scrub_flags)) {
 		mutex_unlock(&acpi_desc->init_mutex);
 		return 0;
 	}
@@ -3536,7 +3539,7 @@ void acpi_nfit_shutdown(void *data)
 	mutex_unlock(&acpi_desc_lock);
 
 	mutex_lock(&acpi_desc->init_mutex);
-	acpi_desc->cancel = 1;
+	set_bit(ARS_CANCEL, &acpi_desc->scrub_flags);
 	cancel_delayed_work_sync(&acpi_desc->dwork);
 	mutex_unlock(&acpi_desc->init_mutex);
 
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 871fb3de3b30..897ce10192a0 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -210,6 +210,11 @@ struct nfit_mem {
 	int family;
 };
 
+enum scrub_flags {
+	ARS_BUSY,
+	ARS_CANCEL,
+};
+
 struct acpi_nfit_desc {
 	struct nvdimm_bus_descriptor nd_desc;
 	struct acpi_table_header acpi_header;
@@ -231,8 +236,7 @@ struct acpi_nfit_desc {
 	unsigned int max_ars;
 	unsigned int scrub_count;
 	unsigned int scrub_mode;
-	unsigned int scrub_busy:1;
-	unsigned int cancel:1;
+	unsigned long scrub_flags;
 	unsigned long dimm_cmd_force_en;
 	unsigned long bus_cmd_force_en;
 	unsigned long bus_nfit_cmd_force_en;


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 6/6] nfit/ars: Avoid stale ARS results
  2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
                   ` (4 preceding siblings ...)
  2019-02-14 20:10 ` [PATCH 5/6] nfit/ars: Introduce scrub_flags Dan Williams
@ 2019-02-14 20:10 ` Dan Williams
  5 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:10 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Krzysztof Rusocki, Vishal Verma, linux-kernel

Gate ARS result consumption on whether the OS issued start-ARS since the
previous consumption. The BIOS may only clear its result buffers after a
successful start-ARS.

Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/acpi/nfit/core.c |   17 ++++++++++++++++-
 drivers/acpi/nfit/nfit.h |    1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 612cf5d92e25..2a49051d5317 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -2650,7 +2650,10 @@ static int ars_start(struct acpi_nfit_desc *acpi_desc,
 
 	if (rc < 0)
 		return rc;
-	return cmd_rc;
+	if (cmd_rc < 0)
+		return cmd_rc;
+	set_bit(ARS_VALID, &acpi_desc->scrub_flags);
+	return 0;
 }
 
 static int ars_continue(struct acpi_nfit_desc *acpi_desc)
@@ -2743,6 +2746,17 @@ static int ars_status_process_records(struct acpi_nfit_desc *acpi_desc)
 	 */
 	if (ars_status->out_length < 44)
 		return 0;
+
+	/*
+	 * Ignore potentially stale results that are only refreshed
+	 * after a start-ARS event.
+	 */
+	if (!test_and_clear_bit(ARS_VALID, &acpi_desc->scrub_flags)) {
+		dev_dbg(acpi_desc->dev, "skip %d stale records\n",
+				ars_status->num_records);
+		return 0;
+	}
+
 	for (i = 0; i < ars_status->num_records; i++) {
 		/* only process full records */
 		if (ars_status->out_length
@@ -3226,6 +3240,7 @@ static int acpi_nfit_register_regions(struct acpi_nfit_desc *acpi_desc)
 	struct nfit_spa *nfit_spa;
 	int rc;
 
+	set_bit(ARS_VALID, &acpi_desc->scrub_flags);
 	list_for_each_entry(nfit_spa, &acpi_desc->spas, list) {
 		switch (nfit_spa_type(nfit_spa->spa)) {
 		case NFIT_SPA_VOLATILE:
diff --git a/drivers/acpi/nfit/nfit.h b/drivers/acpi/nfit/nfit.h
index 897ce10192a0..1efd248104e0 100644
--- a/drivers/acpi/nfit/nfit.h
+++ b/drivers/acpi/nfit/nfit.h
@@ -213,6 +213,7 @@ struct nfit_mem {
 enum scrub_flags {
 	ARS_BUSY,
 	ARS_CANCEL,
+	ARS_VALID,
 };
 
 struct acpi_nfit_desc {


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2] nfit/ars: Allow root to busy-poll the ARS state machine
  2019-02-14 20:10 ` [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine Dan Williams
@ 2019-02-14 20:19   ` Dan Williams
  2019-02-15 16:43     ` Dan Williams
  2019-02-14 20:27   ` [PATCH 3/6] " Dan Williams
  1 sibling, 1 reply; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:19 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Erwin Tsaur, linux-kernel

The ARS implementation implements exponential back-off on the poll
interval to prevent high-frequency access to the DIMM / platform
interface. Depending on when the ARS completes the poll interval may
exceed the completion event by minutes. Allow root to reset the timeout
each time it probes the status. A one-second timeout is still enforced,
but root can otherwise can control the poll interval.

Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
Change since v1: Use mod_delayed_work() instead of queue_delayed_work()
to modify the timeout for existing work.

 drivers/acpi/nfit/core.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 934be96dc149..b1ab593a808a 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1328,6 +1328,12 @@ static ssize_t scrub_show(struct device *dev,
 		rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
 				acpi_desc->scrub_busy
 				&& !acpi_desc->cancel ? "+\n" : "\n");
+		/* Allow an admin to poll the busy state at a higher rate */
+		if (acpi_desc->scrub_busy && !acpi_desc->cancel
+				&& capable(CAP_SYS_RAWIO)) {
+			acpi_desc->scrub_tmo = 1;
+			mod_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
+		}
 		mutex_unlock(&acpi_desc->init_mutex);
 	}
 	device_unlock(dev);


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine
  2019-02-14 20:10 ` [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine Dan Williams
  2019-02-14 20:19   ` [PATCH v2] " Dan Williams
@ 2019-02-14 20:27   ` Dan Williams
  1 sibling, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-14 20:27 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Erwin Tsaur, Vishal L Verma, Linux Kernel Mailing List

On Thu, Feb 14, 2019 at 12:23 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> The ARS implementation implements exponential back-off on the poll
> interval to prevent high-frequency access to the DIMM / platform
> interface. Depending on when the ARS completes the poll interval may
> exceed the completion event by minutes. Allow root to reset the timeout
> each time it probes the status. A one-second timeout is still enforced,
> but root can otherwise can control the poll interval.
>
> Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/acpi/nfit/core.c |    6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> index 934be96dc149..5ec0e19760ea 100644
> --- a/drivers/acpi/nfit/core.c
> +++ b/drivers/acpi/nfit/core.c
> @@ -1328,6 +1328,12 @@ static ssize_t scrub_show(struct device *dev,
>                 rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
>                                 acpi_desc->scrub_busy
>                                 && !acpi_desc->cancel ? "+\n" : "\n");
> +               /* Allow an admin to poll the busy state at a higher rate */
> +               if (acpi_desc->scrub_busy && !acpi_desc->cancel
> +                               && capable(CAP_SYS_RAWIO)) {
> +                       acpi_desc->scrub_tmo = 1;
> +                       queue_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);

Whoops, just realized this should be mod_delayed_work().
queue_delayed_work() is a nop if the work is already scheduled. v2
inbound.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] nfit/ars: Allow root to busy-poll the ARS state machine
  2019-02-14 20:19   ` [PATCH v2] " Dan Williams
@ 2019-02-15 16:43     ` Dan Williams
  0 siblings, 0 replies; 10+ messages in thread
From: Dan Williams @ 2019-02-15 16:43 UTC (permalink / raw)
  To: linux-nvdimm; +Cc: Erwin Tsaur, Linux Kernel Mailing List

On Thu, Feb 14, 2019 at 12:32 PM Dan Williams <dan.j.williams@intel.com> wrote:
>
> The ARS implementation implements exponential back-off on the poll
> interval to prevent high-frequency access to the DIMM / platform
> interface. Depending on when the ARS completes the poll interval may
> exceed the completion event by minutes. Allow root to reset the timeout
> each time it probes the status. A one-second timeout is still enforced,
> but root can otherwise can control the poll interval.
>
> Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> Change since v1: Use mod_delayed_work() instead of queue_delayed_work()
> to modify the timeout for existing work.
>
>  drivers/acpi/nfit/core.c |    6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
> index 934be96dc149..b1ab593a808a 100644
> --- a/drivers/acpi/nfit/core.c
> +++ b/drivers/acpi/nfit/core.c
> @@ -1328,6 +1328,12 @@ static ssize_t scrub_show(struct device *dev,
>                 rc = sprintf(buf, "%d%s", acpi_desc->scrub_count,
>                                 acpi_desc->scrub_busy
>                                 && !acpi_desc->cancel ? "+\n" : "\n");
> +               /* Allow an admin to poll the busy state at a higher rate */
> +               if (acpi_desc->scrub_busy && !acpi_desc->cancel
> +                               && capable(CAP_SYS_RAWIO)) {
> +                       acpi_desc->scrub_tmo = 1;
> +                       mod_delayed_work(nfit_wq, &acpi_desc->dwork, HZ);
> +               }

I added support to ndctl to specify a poll-interval to 'ndctl
wait-scrub'. The support highlighted an infinite loop problem as
mod_delayed_work() called in a loop prevented the workqueue from ever
running. A revised patch-set adds a new ARS_POLL flag to ensure that
the workqueue runs at least once after every mod_delayed_work().

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-02-15 16:44 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-14 20:10 [PATCH 0/6] nfit/ars: Improve polling and short-ARS execution Dan Williams
2019-02-14 20:10 ` [PATCH 1/6] nfit/ars: Attempt a short-ARS whenever the ARS state is idle at boot Dan Williams
2019-02-14 20:10 ` [PATCH 2/6] nfit/ars: Attempt short-ARS even in the no_init_ars case Dan Williams
2019-02-14 20:10 ` [PATCH 3/6] nfit/ars: Allow root to busy-poll the ARS state machine Dan Williams
2019-02-14 20:19   ` [PATCH v2] " Dan Williams
2019-02-15 16:43     ` Dan Williams
2019-02-14 20:27   ` [PATCH 3/6] " Dan Williams
2019-02-14 20:10 ` [PATCH 4/6] nfit/ars: Remove ars_start_flags Dan Williams
2019-02-14 20:10 ` [PATCH 5/6] nfit/ars: Introduce scrub_flags Dan Williams
2019-02-14 20:10 ` [PATCH 6/6] nfit/ars: Avoid stale ARS results Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).