All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] spi: A better solution for cros_ec_spi reliability
@ 2019-05-14 18:39 Douglas Anderson
  2019-05-14 18:39 ` [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers Douglas Anderson
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Douglas Anderson @ 2019-05-14 18:39 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel, linux-spi

This series is a much better solution for getting the Chrome OS EC to
talk reliably.

Patch #1 in this series is the most important.  It can land any time.

Patch #2 in this series (a SPI framework patch) needs to land before
patch #3.  Note that patches #2 and #3 really just fix a corner case
and just having patch #1 is the most important.  We don't end up on
the pumping thread very often.

Note:
- If you want some history on investigation done here, feel free to
  peruse the Chrome OS bug: <https://crbug.com/948742>.

Changes in v3:
- cros_ec realtime patch replaces revert; now patch #1
- SPI core change now like patch v1 patch #2 (with name "rt").
- Updated description and variable name since we no longer force.

Changes in v2:
- Now only force transfers to the thread for devices that want it.
- Squashed patch #1 and #2 together.
- Renamed variable to "force_rt_transfers".
- Renamed variable to "force_rt_transfers".

Douglas Anderson (3):
  platform/chrome: cros_ec_spi: Move to real time priority for transfers
  spi: Allow SPI devices to request the pumping thread be realtime
  platform/chrome: cros_ec_spi: Request the SPI thread be realtime

 drivers/platform/chrome/cros_ec_spi.c | 89 +++++++++++++++++++++++----
 drivers/spi/spi.c                     | 36 +++++++++--
 include/linux/spi/spi.h               |  2 +
 3 files changed, 110 insertions(+), 17 deletions(-)

-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers
  2019-05-14 18:39 [PATCH v3 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
@ 2019-05-14 18:39 ` Douglas Anderson
  2019-05-14 19:34   ` Guenter Roeck
       [not found] ` <20190514183935.143463-1-dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  2019-05-14 18:39 ` [PATCH v3 3/3] platform/chrome: cros_ec_spi: Request the SPI " Douglas Anderson
  2 siblings, 1 reply; 8+ messages in thread
From: Douglas Anderson @ 2019-05-14 18:39 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel

In commit 37a186225a0c ("platform/chrome: cros_ec_spi: Transfer
messages at high priority") we moved transfers to a high priority
workqueue.  This helped make them much more reliable.

...but, we still saw failures.

We were actually finding ourselves competing for time with dm-crypt
which also scheduled work on HIGHPRI workqueues.  While we can
consider reverting the change that made dm-crypt run its work at
HIGHPRI, the argument in commit a1b89132dc4f ("dm crypt: use
WQ_HIGHPRI for the IO and crypt workqueues") is somewhat compelling.
It does make sense for IO to be scheduled at a priority that's higher
than the default user priority.  It also turns out that dm-crypt isn't
alone in using high priority like this.  loop_prepare_queue() does
something similar for loopback devices.

Looking in more detail, it can be seen that the high priority
workqueue isn't actually that high of a priority.  It runs at MIN_NICE
which is _fairly_ high priority but still below all real time
priority.

Should we move cros_ec_spi to real time priority to fix our problems,
or is this just escalating a priority war?  I'll argue here that
cros_ec_spi _does_ belong at real time priority.  Specifically
cros_ec_spi actually needs to run quickly for correctness.  As I
understand this is exactly what real time priority is for.

There currently doesn't appear to be any way to use the standard
workqueue APIs with a real time priority, so we'll switch over to
using using a kthread worker.  We'll match the priority that the SPI
core uses when it wants to do things on a realtime thread and just use
"MAX_RT_PRIO - 1".

This commit plus the patch ("platform/chrome: cros_ec_spi: Request the
SPI thread be realtime") are enough to get communications very close
to 100% reliable (the only known problem left is when serial console
is turned on, which isn't something that happens in shipping devices).
Specifically this test case now passes (tested on rk3288-veyron-jerry):

  dd if=/dev/zero of=/var/log/foo.txt bs=4M count=512&
  while true; do
    ectool version > /dev/null;
  done

It should be noted that "/var/log" is encrypted (and goes through
dm-crypt) and also passes through a loopback device.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- cros_ec realtime patch replaces revert; now patch #1

Changes in v2: None

 drivers/platform/chrome/cros_ec_spi.c | 88 +++++++++++++++++++++++----
 1 file changed, 77 insertions(+), 11 deletions(-)

diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c
index 8e9451720e73..b89bf11dda64 100644
--- a/drivers/platform/chrome/cros_ec_spi.c
+++ b/drivers/platform/chrome/cros_ec_spi.c
@@ -13,6 +13,8 @@
 #include <linux/slab.h>
 #include <linux/spi/spi.h>
 
+#include <uapi/linux/sched/types.h>
+
 
 /* The header byte, which follows the preamble */
 #define EC_MSG_HEADER			0xec
@@ -67,12 +69,16 @@
  *      is sent when we want to turn on CS at the start of a transaction.
  * @end_of_msg_delay: used to set the delay_usecs on the spi_transfer that
  *      is sent when we want to turn off CS at the end of a transaction.
+ * @high_pri_worker: Used to give work to high_pri_thread.
+ * @high_pri_thread: We'll do our transfers here to reduce preemption problems.
  */
 struct cros_ec_spi {
 	struct spi_device *spi;
 	s64 last_transfer_ns;
 	unsigned int start_of_msg_delay;
 	unsigned int end_of_msg_delay;
+	struct kthread_worker high_pri_worker;
+	struct task_struct *high_pri_thread;
 };
 
 typedef int (*cros_ec_xfer_fn_t) (struct cros_ec_device *ec_dev,
@@ -89,7 +95,7 @@ typedef int (*cros_ec_xfer_fn_t) (struct cros_ec_device *ec_dev,
  */
 
 struct cros_ec_xfer_work_params {
-	struct work_struct work;
+	struct kthread_work work;
 	cros_ec_xfer_fn_t fn;
 	struct cros_ec_device *ec_dev;
 	struct cros_ec_command *ec_msg;
@@ -632,7 +638,7 @@ static int do_cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
 	return ret;
 }
 
-static void cros_ec_xfer_high_pri_work(struct work_struct *work)
+static void cros_ec_xfer_high_pri_work(struct kthread_work *work)
 {
 	struct cros_ec_xfer_work_params *params;
 
@@ -644,12 +650,14 @@ static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
 				 struct cros_ec_command *ec_msg,
 				 cros_ec_xfer_fn_t fn)
 {
-	struct cros_ec_xfer_work_params params;
-
-	INIT_WORK_ONSTACK(&params.work, cros_ec_xfer_high_pri_work);
-	params.ec_dev = ec_dev;
-	params.ec_msg = ec_msg;
-	params.fn = fn;
+	struct cros_ec_spi *ec_spi = ec_dev->priv;
+	struct cros_ec_xfer_work_params params = {
+		.work = KTHREAD_WORK_INIT(params.work,
+					  cros_ec_xfer_high_pri_work),
+		.ec_dev = ec_dev,
+		.ec_msg = ec_msg,
+		.fn = fn,
+	};
 
 	/*
 	 * This looks a bit ridiculous.  Why do the work on a
@@ -660,9 +668,8 @@ static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
 	 * context switched out for too long and the EC giving up on
 	 * the transfer.
 	 */
-	queue_work(system_highpri_wq, &params.work);
-	flush_work(&params.work);
-	destroy_work_on_stack(&params.work);
+	kthread_queue_work(&ec_spi->high_pri_worker, &params.work);
+	kthread_flush_work(&params.work);
 
 	return params.ret;
 }
@@ -694,6 +701,61 @@ static void cros_ec_spi_dt_probe(struct cros_ec_spi *ec_spi, struct device *dev)
 		ec_spi->end_of_msg_delay = val;
 }
 
+static void cros_ec_spi_high_pri_release(struct device *dev, void *res)
+{
+	struct cros_ec_spi *ec_spi = *(struct cros_ec_spi **)res;
+
+	kthread_stop(ec_spi->high_pri_thread);
+	kthread_destroy_worker(&ec_spi->high_pri_worker);
+}
+
+static int cros_ec_spi_devm_high_pri_alloc(struct device *dev,
+					   struct cros_ec_spi *ec_spi)
+{
+	struct sched_param sched_priority = {
+		.sched_priority = MAX_RT_PRIO - 1,
+	};
+	struct cros_ec_spi **ptr;
+	int err = 0;
+
+	ptr = devres_alloc(cros_ec_spi_high_pri_release, sizeof(*ptr),
+			   GFP_KERNEL);
+	if (!ptr)
+		return -ENOMEM;
+	*ptr = ec_spi;
+
+	kthread_init_worker(&ec_spi->high_pri_worker);
+	ec_spi->high_pri_thread = kthread_create(kthread_worker_fn,
+						 &ec_spi->high_pri_worker,
+						 "cros_ec_spi_high_pri");
+	if (IS_ERR(ec_spi->high_pri_thread)) {
+		err = PTR_ERR(ec_spi->high_pri_thread);
+		dev_err(dev, "Can't create cros_ec high pri thread: %d\n", err);
+		goto err_worker_initted;
+	}
+
+	err = sched_setscheduler_nocheck(ec_spi->high_pri_thread,
+					 SCHED_FIFO, &sched_priority);
+	if (err) {
+		dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err);
+		goto err_thread_created;
+	}
+
+	wake_up_process(ec_spi->high_pri_thread);
+
+	devres_add(dev, ptr);
+
+	return 0;
+
+err_thread_created:
+	kthread_stop(ec_spi->high_pri_thread);
+
+err_worker_initted:
+	kthread_destroy_worker(&ec_spi->high_pri_worker);
+	devres_free(ptr);
+	return err;
+}
+
 static int cros_ec_spi_probe(struct spi_device *spi)
 {
 	struct device *dev = &spi->dev;
@@ -732,6 +794,10 @@ static int cros_ec_spi_probe(struct spi_device *spi)
 
 	ec_spi->last_transfer_ns = ktime_get_ns();
 
+	err = cros_ec_spi_devm_high_pri_alloc(dev, ec_spi);
+	if (err)
+		return err;
+
 	err = cros_ec_register(ec_dev);
 	if (err) {
 		dev_err(dev, "cannot register EC\n");
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/3] spi: Allow SPI devices to request the pumping thread be realtime
  2019-05-14 18:39 [PATCH v3 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
@ 2019-05-14 18:39     ` Douglas Anderson
       [not found] ` <20190514183935.143463-1-dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
  2019-05-14 18:39 ` [PATCH v3 3/3] platform/chrome: cros_ec_spi: Request the SPI " Douglas Anderson
  2 siblings, 0 replies; 8+ messages in thread
From: Douglas Anderson @ 2019-05-14 18:39 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: drinkcat-F7+t8E8rja9g9hUCZPvPmw,
	briannorris-F7+t8E8rja9g9hUCZPvPmw, Douglas Anderson,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rockchip-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	mka-F7+t8E8rja9g9hUCZPvPmw, Guenter Roeck,
	linux-spi-u79uwXL29TY76Z2rM5mHXA

Right now the only way to get the SPI pumping thread bumped up to
realtime priority is for the controller to request it.  However it may
be that the controller works fine with the normal priority but
communication to a particular SPI device on the bus needs realtime
priority.

Let's add a way for devices to request realtime priority when they set
themselves up.

NOTE: this will just affect the priority of transfers that end up on
the SPI core's pumping thread.  In many cases transfers happen in the
context of the caller so if you need realtime priority for all
transfers you should ensure the calling context is also realtime
priority.

Signed-off-by: Douglas Anderson <dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
---

Changes in v3:
- SPI core change now like patch v1 patch #2 (with name "rt").

Changes in v2:
- Now only force transfers to the thread for devices that want it.
- Squashed patch #1 and #2 together.
- Renamed variable to "force_rt_transfers".

 drivers/spi/spi.c       | 36 ++++++++++++++++++++++++++++++------
 include/linux/spi/spi.h |  2 ++
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 8eb7460dd744..466984796dd9 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1364,10 +1364,32 @@ static void spi_pump_messages(struct kthread_work *work)
 	__spi_pump_messages(ctlr, true);
 }
 
-static int spi_init_queue(struct spi_controller *ctlr)
+/**
+ * spi_set_thread_rt - set the controller to pump at realtime priority
+ * @ctlr: controller to boost priority of
+ *
+ * This can be called because the controller requested realtime priority
+ * (by setting the ->rt value before calling spi_register_controller()) or
+ * because a device on the bus said that its transfers needed realtime
+ * priority.
+ *
+ * NOTE: at the moment if any device on a bus says it needs realtime then
+ * the thread will be at realtime priority for all transfers on that
+ * controller.  If this eventually becomes a problem we may see if we can
+ * find a way to boost the priority only temporarily during relevant
+ * transfers.
+ */
+static void spi_set_thread_rt(struct spi_controller *ctlr)
 {
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
 
+	dev_info(&ctlr->dev,
+		"will run message pump with realtime priority\n");
+	sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
+}
+
+static int spi_init_queue(struct spi_controller *ctlr)
+{
 	ctlr->running = false;
 	ctlr->busy = false;
 
@@ -1387,11 +1409,8 @@ static int spi_init_queue(struct spi_controller *ctlr)
 	 * request and the scheduling of the message pump thread. Without this
 	 * setting the message pump thread will remain at default priority.
 	 */
-	if (ctlr->rt) {
-		dev_info(&ctlr->dev,
-			"will run message pump with realtime priority\n");
-		sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
-	}
+	if (ctlr->rt)
+		spi_set_thread_rt(ctlr);
 
 	return 0;
 }
@@ -2982,6 +3001,11 @@ int spi_setup(struct spi_device *spi)
 
 	spi_set_cs(spi, false);
 
+	if (spi->rt && !spi->controller->rt) {
+		spi->controller->rt = true;
+		spi_set_thread_rt(spi->controller);
+	}
+
 	dev_dbg(&spi->dev, "setup mode %d, %s%s%s%s%u bits/w, %u Hz max --> %d\n",
 			(int) (spi->mode & (SPI_CPOL | SPI_CPHA)),
 			(spi->mode & SPI_CS_HIGH) ? "cs_high, " : "",
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 053abd22ad31..15505c2485d6 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -109,6 +109,7 @@ void spi_statistics_add_transfer_stats(struct spi_statistics *stats,
  *	This may be changed by the device's driver, or left at the
  *	default (0) indicating protocol words are eight bit bytes.
  *	The spi_transfer.bits_per_word can override this for each transfer.
+ * @rt: Make the pump thread real time priority.
  * @irq: Negative, or the number passed to request_irq() to receive
  *	interrupts from this device.
  * @controller_state: Controller's runtime state
@@ -143,6 +144,7 @@ struct spi_device {
 	u32			max_speed_hz;
 	u8			chip_select;
 	u8			bits_per_word;
+	bool			rt;
 	u32			mode;
 #define	SPI_CPHA	0x01			/* clock phase */
 #define	SPI_CPOL	0x02			/* clock polarity */
-- 
2.21.0.1020.gf2820cf01a-goog

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 2/3] spi: Allow SPI devices to request the pumping thread be realtime
@ 2019-05-14 18:39     ` Douglas Anderson
  0 siblings, 0 replies; 8+ messages in thread
From: Douglas Anderson @ 2019-05-14 18:39 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel, linux-spi

Right now the only way to get the SPI pumping thread bumped up to
realtime priority is for the controller to request it.  However it may
be that the controller works fine with the normal priority but
communication to a particular SPI device on the bus needs realtime
priority.

Let's add a way for devices to request realtime priority when they set
themselves up.

NOTE: this will just affect the priority of transfers that end up on
the SPI core's pumping thread.  In many cases transfers happen in the
context of the caller so if you need realtime priority for all
transfers you should ensure the calling context is also realtime
priority.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- SPI core change now like patch v1 patch #2 (with name "rt").

Changes in v2:
- Now only force transfers to the thread for devices that want it.
- Squashed patch #1 and #2 together.
- Renamed variable to "force_rt_transfers".

 drivers/spi/spi.c       | 36 ++++++++++++++++++++++++++++++------
 include/linux/spi/spi.h |  2 ++
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 8eb7460dd744..466984796dd9 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1364,10 +1364,32 @@ static void spi_pump_messages(struct kthread_work *work)
 	__spi_pump_messages(ctlr, true);
 }
 
-static int spi_init_queue(struct spi_controller *ctlr)
+/**
+ * spi_set_thread_rt - set the controller to pump at realtime priority
+ * @ctlr: controller to boost priority of
+ *
+ * This can be called because the controller requested realtime priority
+ * (by setting the ->rt value before calling spi_register_controller()) or
+ * because a device on the bus said that its transfers needed realtime
+ * priority.
+ *
+ * NOTE: at the moment if any device on a bus says it needs realtime then
+ * the thread will be at realtime priority for all transfers on that
+ * controller.  If this eventually becomes a problem we may see if we can
+ * find a way to boost the priority only temporarily during relevant
+ * transfers.
+ */
+static void spi_set_thread_rt(struct spi_controller *ctlr)
 {
 	struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
 
+	dev_info(&ctlr->dev,
+		"will run message pump with realtime priority\n");
+	sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
+}
+
+static int spi_init_queue(struct spi_controller *ctlr)
+{
 	ctlr->running = false;
 	ctlr->busy = false;
 
@@ -1387,11 +1409,8 @@ static int spi_init_queue(struct spi_controller *ctlr)
 	 * request and the scheduling of the message pump thread. Without this
 	 * setting the message pump thread will remain at default priority.
 	 */
-	if (ctlr->rt) {
-		dev_info(&ctlr->dev,
-			"will run message pump with realtime priority\n");
-		sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
-	}
+	if (ctlr->rt)
+		spi_set_thread_rt(ctlr);
 
 	return 0;
 }
@@ -2982,6 +3001,11 @@ int spi_setup(struct spi_device *spi)
 
 	spi_set_cs(spi, false);
 
+	if (spi->rt && !spi->controller->rt) {
+		spi->controller->rt = true;
+		spi_set_thread_rt(spi->controller);
+	}
+
 	dev_dbg(&spi->dev, "setup mode %d, %s%s%s%s%u bits/w, %u Hz max --> %d\n",
 			(int) (spi->mode & (SPI_CPOL | SPI_CPHA)),
 			(spi->mode & SPI_CS_HIGH) ? "cs_high, " : "",
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 053abd22ad31..15505c2485d6 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -109,6 +109,7 @@ void spi_statistics_add_transfer_stats(struct spi_statistics *stats,
  *	This may be changed by the device's driver, or left at the
  *	default (0) indicating protocol words are eight bit bytes.
  *	The spi_transfer.bits_per_word can override this for each transfer.
+ * @rt: Make the pump thread real time priority.
  * @irq: Negative, or the number passed to request_irq() to receive
  *	interrupts from this device.
  * @controller_state: Controller's runtime state
@@ -143,6 +144,7 @@ struct spi_device {
 	u32			max_speed_hz;
 	u8			chip_select;
 	u8			bits_per_word;
+	bool			rt;
 	u32			mode;
 #define	SPI_CPHA	0x01			/* clock phase */
 #define	SPI_CPOL	0x02			/* clock polarity */
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 3/3] platform/chrome: cros_ec_spi: Request the SPI thread be realtime
  2019-05-14 18:39 [PATCH v3 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
  2019-05-14 18:39 ` [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers Douglas Anderson
       [not found] ` <20190514183935.143463-1-dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
@ 2019-05-14 18:39 ` Douglas Anderson
  2 siblings, 0 replies; 8+ messages in thread
From: Douglas Anderson @ 2019-05-14 18:39 UTC (permalink / raw)
  To: Mark Brown, Benson Leung, Enric Balletbo i Serra
  Cc: linux-rockchip, drinkcat, Guenter Roeck, briannorris, mka,
	Douglas Anderson, linux-kernel

All currently known ECs in the wild are very sensitive to timing.
Specifically the ECs are known to drop a transfer if more than 8 ms
passes from the assertion of the chip select until the transfer
finishes.

Let's use the new feature introduced in the patch (spi: Allow SPI
devices to request the pumping thread be realtime") to request the SPI
pumping thread be realtime.  This means that if we get shunted off to
the SPI thread for whatever reason we won't get downgraded to low
priority.

NOTES:
- We still need to keep ourselves as high priority since the SPI core
  doesn't guarantee that all transfers end up on the pumping thread
  (in fact, it tries pretty hard to do them in the calling context).
- If future Chrome OS ECs ever fix themselves to be less sensitive
  then we could consider adding a property (or compatible string) to
  not set this property.  For now we need it across the board.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
---

Changes in v3:
- Updated description and variable name since we no longer force.

Changes in v2:
- Renamed variable to "force_rt_transfers".

 drivers/platform/chrome/cros_ec_spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c
index b89bf11dda64..4f34e2215884 100644
--- a/drivers/platform/chrome/cros_ec_spi.c
+++ b/drivers/platform/chrome/cros_ec_spi.c
@@ -765,6 +765,7 @@ static int cros_ec_spi_probe(struct spi_device *spi)
 
 	spi->bits_per_word = 8;
 	spi->mode = SPI_MODE_0;
+	spi->rt = true;
 	err = spi_setup(spi);
 	if (err < 0)
 		return err;
-- 
2.21.0.1020.gf2820cf01a-goog


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers
  2019-05-14 18:39 ` [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers Douglas Anderson
@ 2019-05-14 19:34   ` Guenter Roeck
  2019-05-14 21:23     ` Doug Anderson
  0 siblings, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2019-05-14 19:34 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Mark Brown, Benson Leung, Enric Balletbo i Serra,
	open list:ARM/Rockchip SoC...,
	Nicolas Boichat, Guenter Roeck, Brian Norris, Matthias Kaehlcke,
	linux-kernel

On Tue, May 14, 2019 at 11:40 AM Douglas Anderson <dianders@chromium.org> wrote:
>
> In commit 37a186225a0c ("platform/chrome: cros_ec_spi: Transfer
> messages at high priority") we moved transfers to a high priority
> workqueue.  This helped make them much more reliable.
>
> ...but, we still saw failures.
>
> We were actually finding ourselves competing for time with dm-crypt
> which also scheduled work on HIGHPRI workqueues.  While we can
> consider reverting the change that made dm-crypt run its work at
> HIGHPRI, the argument in commit a1b89132dc4f ("dm crypt: use
> WQ_HIGHPRI for the IO and crypt workqueues") is somewhat compelling.
> It does make sense for IO to be scheduled at a priority that's higher
> than the default user priority.  It also turns out that dm-crypt isn't
> alone in using high priority like this.  loop_prepare_queue() does
> something similar for loopback devices.
>
> Looking in more detail, it can be seen that the high priority
> workqueue isn't actually that high of a priority.  It runs at MIN_NICE
> which is _fairly_ high priority but still below all real time
> priority.
>
> Should we move cros_ec_spi to real time priority to fix our problems,
> or is this just escalating a priority war?  I'll argue here that
> cros_ec_spi _does_ belong at real time priority.  Specifically
> cros_ec_spi actually needs to run quickly for correctness.  As I
> understand this is exactly what real time priority is for.
>
> There currently doesn't appear to be any way to use the standard
> workqueue APIs with a real time priority, so we'll switch over to
> using using a kthread worker.  We'll match the priority that the SPI
> core uses when it wants to do things on a realtime thread and just use
> "MAX_RT_PRIO - 1".
>
> This commit plus the patch ("platform/chrome: cros_ec_spi: Request the
> SPI thread be realtime") are enough to get communications very close
> to 100% reliable (the only known problem left is when serial console
> is turned on, which isn't something that happens in shipping devices).
> Specifically this test case now passes (tested on rk3288-veyron-jerry):
>
>   dd if=/dev/zero of=/var/log/foo.txt bs=4M count=512&
>   while true; do
>     ectool version > /dev/null;
>   done
>
> It should be noted that "/var/log" is encrypted (and goes through
> dm-crypt) and also passes through a loopback device.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
> Changes in v3:
> - cros_ec realtime patch replaces revert; now patch #1
>
> Changes in v2: None
>
>  drivers/platform/chrome/cros_ec_spi.c | 88 +++++++++++++++++++++++----
>  1 file changed, 77 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/platform/chrome/cros_ec_spi.c b/drivers/platform/chrome/cros_ec_spi.c
> index 8e9451720e73..b89bf11dda64 100644
> --- a/drivers/platform/chrome/cros_ec_spi.c
> +++ b/drivers/platform/chrome/cros_ec_spi.c
> @@ -13,6 +13,8 @@
>  #include <linux/slab.h>
>  #include <linux/spi/spi.h>
>
> +#include <uapi/linux/sched/types.h>
> +

Extra newline.

>
>  /* The header byte, which follows the preamble */
>  #define EC_MSG_HEADER                  0xec
> @@ -67,12 +69,16 @@
>   *      is sent when we want to turn on CS at the start of a transaction.
>   * @end_of_msg_delay: used to set the delay_usecs on the spi_transfer that
>   *      is sent when we want to turn off CS at the end of a transaction.
> + * @high_pri_worker: Used to give work to high_pri_thread.
> + * @high_pri_thread: We'll do our transfers here to reduce preemption problems.
>   */
>  struct cros_ec_spi {
>         struct spi_device *spi;
>         s64 last_transfer_ns;
>         unsigned int start_of_msg_delay;
>         unsigned int end_of_msg_delay;
> +       struct kthread_worker high_pri_worker;
> +       struct task_struct *high_pri_thread;
>  };
>
>  typedef int (*cros_ec_xfer_fn_t) (struct cros_ec_device *ec_dev,
> @@ -89,7 +95,7 @@ typedef int (*cros_ec_xfer_fn_t) (struct cros_ec_device *ec_dev,
>   */
>
>  struct cros_ec_xfer_work_params {
> -       struct work_struct work;
> +       struct kthread_work work;
>         cros_ec_xfer_fn_t fn;
>         struct cros_ec_device *ec_dev;
>         struct cros_ec_command *ec_msg;
> @@ -632,7 +638,7 @@ static int do_cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
>         return ret;
>  }
>
> -static void cros_ec_xfer_high_pri_work(struct work_struct *work)
> +static void cros_ec_xfer_high_pri_work(struct kthread_work *work)
>  {
>         struct cros_ec_xfer_work_params *params;
>
> @@ -644,12 +650,14 @@ static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
>                                  struct cros_ec_command *ec_msg,
>                                  cros_ec_xfer_fn_t fn)
>  {
> -       struct cros_ec_xfer_work_params params;
> -
> -       INIT_WORK_ONSTACK(&params.work, cros_ec_xfer_high_pri_work);
> -       params.ec_dev = ec_dev;
> -       params.ec_msg = ec_msg;
> -       params.fn = fn;
> +       struct cros_ec_spi *ec_spi = ec_dev->priv;
> +       struct cros_ec_xfer_work_params params = {
> +               .work = KTHREAD_WORK_INIT(params.work,
> +                                         cros_ec_xfer_high_pri_work),
> +               .ec_dev = ec_dev,
> +               .ec_msg = ec_msg,
> +               .fn = fn,
> +       };
>
>         /*
>          * This looks a bit ridiculous.  Why do the work on a
> @@ -660,9 +668,8 @@ static int cros_ec_xfer_high_pri(struct cros_ec_device *ec_dev,
>          * context switched out for too long and the EC giving up on
>          * the transfer.
>          */
> -       queue_work(system_highpri_wq, &params.work);
> -       flush_work(&params.work);
> -       destroy_work_on_stack(&params.work);
> +       kthread_queue_work(&ec_spi->high_pri_worker, &params.work);
> +       kthread_flush_work(&params.work);
>
>         return params.ret;
>  }
> @@ -694,6 +701,61 @@ static void cros_ec_spi_dt_probe(struct cros_ec_spi *ec_spi, struct device *dev)
>                 ec_spi->end_of_msg_delay = val;
>  }
>
> +static void cros_ec_spi_high_pri_release(struct device *dev, void *res)
> +{
> +       struct cros_ec_spi *ec_spi = *(struct cros_ec_spi **)res;
> +
> +       kthread_stop(ec_spi->high_pri_thread);

Is that needed ? kthread_destroy_worker() does it for you.

> +       kthread_destroy_worker(&ec_spi->high_pri_worker);
> +}
> +
> +static int cros_ec_spi_devm_high_pri_alloc(struct device *dev,
> +                                          struct cros_ec_spi *ec_spi)
> +{
> +       struct sched_param sched_priority = {
> +               .sched_priority = MAX_RT_PRIO - 1,
> +       };
> +       struct cros_ec_spi **ptr;
> +       int err = 0;
> +
> +       ptr = devres_alloc(cros_ec_spi_high_pri_release, sizeof(*ptr),
> +                          GFP_KERNEL);
> +       if (!ptr)
> +               return -ENOMEM;
> +       *ptr = ec_spi;
> +
> +       kthread_init_worker(&ec_spi->high_pri_worker);
> +       ec_spi->high_pri_thread = kthread_create(kthread_worker_fn,
> +                                                &ec_spi->high_pri_worker,
> +                                                "cros_ec_spi_high_pri");
> +       if (IS_ERR(ec_spi->high_pri_thread)) {
> +               err = PTR_ERR(ec_spi->high_pri_thread);
> +               dev_err(dev, "Can't create cros_ec high pri thread: %d\n", err);
> +               goto err_worker_initted;
> +       }
> +
> +       err = sched_setscheduler_nocheck(ec_spi->high_pri_thread,
> +                                        SCHED_FIFO, &sched_priority);
> +       if (err) {
> +               dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err);
> +               goto err_thread_created;
> +       }
> +
> +       wake_up_process(ec_spi->high_pri_thread);
> +
> +       devres_add(dev, ptr);
> +

If you move that ahead of sched_setscheduler_nocheck(), you would not
need the err_thread_created: label.

> +       return 0;
> +
> +err_thread_created:
> +       kthread_stop(ec_spi->high_pri_thread);
> +
> +err_worker_initted:
> +       kthread_destroy_worker(&ec_spi->high_pri_worker);

Are you sure about this ? The worker isn't started here, but
kthread_destroy_worker() tries to stop it.

> +       devres_free(ptr);
> +       return err;
> +}
> +
>  static int cros_ec_spi_probe(struct spi_device *spi)
>  {
>         struct device *dev = &spi->dev;
> @@ -732,6 +794,10 @@ static int cros_ec_spi_probe(struct spi_device *spi)
>
>         ec_spi->last_transfer_ns = ktime_get_ns();
>
> +       err = cros_ec_spi_devm_high_pri_alloc(dev, ec_spi);
> +       if (err)
> +               return err;
> +
>         err = cros_ec_register(ec_dev);
>         if (err) {
>                 dev_err(dev, "cannot register EC\n");
> --
> 2.21.0.1020.gf2820cf01a-goog
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 2/3] spi: Allow SPI devices to request the pumping thread be realtime
  2019-05-14 18:39     ` Douglas Anderson
  (?)
@ 2019-05-14 19:44     ` Guenter Roeck
  -1 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2019-05-14 19:44 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Mark Brown, Benson Leung, Enric Balletbo i Serra,
	open list:ARM/Rockchip SoC...,
	Nicolas Boichat, Guenter Roeck, Brian Norris, Matthias Kaehlcke,
	linux-kernel, linux-spi

On Tue, May 14, 2019 at 11:40 AM Douglas Anderson <dianders@chromium.org> wrote:
>
> Right now the only way to get the SPI pumping thread bumped up to
> realtime priority is for the controller to request it.  However it may
> be that the controller works fine with the normal priority but
> communication to a particular SPI device on the bus needs realtime
> priority.
>
> Let's add a way for devices to request realtime priority when they set
> themselves up.
>
> NOTE: this will just affect the priority of transfers that end up on
> the SPI core's pumping thread.  In many cases transfers happen in the
> context of the caller so if you need realtime priority for all
> transfers you should ensure the calling context is also realtime
> priority.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Reviewed-by: Guenter Roeck <groeck@chromium.org>

> ---
>
> Changes in v3:
> - SPI core change now like patch v1 patch #2 (with name "rt").
>
> Changes in v2:
> - Now only force transfers to the thread for devices that want it.
> - Squashed patch #1 and #2 together.
> - Renamed variable to "force_rt_transfers".
>
>  drivers/spi/spi.c       | 36 ++++++++++++++++++++++++++++++------
>  include/linux/spi/spi.h |  2 ++
>  2 files changed, 32 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 8eb7460dd744..466984796dd9 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -1364,10 +1364,32 @@ static void spi_pump_messages(struct kthread_work *work)
>         __spi_pump_messages(ctlr, true);
>  }
>
> -static int spi_init_queue(struct spi_controller *ctlr)
> +/**
> + * spi_set_thread_rt - set the controller to pump at realtime priority
> + * @ctlr: controller to boost priority of
> + *
> + * This can be called because the controller requested realtime priority
> + * (by setting the ->rt value before calling spi_register_controller()) or
> + * because a device on the bus said that its transfers needed realtime
> + * priority.
> + *
> + * NOTE: at the moment if any device on a bus says it needs realtime then
> + * the thread will be at realtime priority for all transfers on that
> + * controller.  If this eventually becomes a problem we may see if we can
> + * find a way to boost the priority only temporarily during relevant
> + * transfers.
> + */
> +static void spi_set_thread_rt(struct spi_controller *ctlr)
>  {
>         struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
>
> +       dev_info(&ctlr->dev,
> +               "will run message pump with realtime priority\n");
> +       sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
> +}
> +
> +static int spi_init_queue(struct spi_controller *ctlr)
> +{
>         ctlr->running = false;
>         ctlr->busy = false;
>
> @@ -1387,11 +1409,8 @@ static int spi_init_queue(struct spi_controller *ctlr)
>          * request and the scheduling of the message pump thread. Without this
>          * setting the message pump thread will remain at default priority.
>          */
> -       if (ctlr->rt) {
> -               dev_info(&ctlr->dev,
> -                       "will run message pump with realtime priority\n");
> -               sched_setscheduler(ctlr->kworker_task, SCHED_FIFO, &param);
> -       }
> +       if (ctlr->rt)
> +               spi_set_thread_rt(ctlr);
>
>         return 0;
>  }
> @@ -2982,6 +3001,11 @@ int spi_setup(struct spi_device *spi)
>
>         spi_set_cs(spi, false);
>
> +       if (spi->rt && !spi->controller->rt) {
> +               spi->controller->rt = true;
> +               spi_set_thread_rt(spi->controller);
> +       }
> +
>         dev_dbg(&spi->dev, "setup mode %d, %s%s%s%s%u bits/w, %u Hz max --> %d\n",
>                         (int) (spi->mode & (SPI_CPOL | SPI_CPHA)),
>                         (spi->mode & SPI_CS_HIGH) ? "cs_high, " : "",
> diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> index 053abd22ad31..15505c2485d6 100644
> --- a/include/linux/spi/spi.h
> +++ b/include/linux/spi/spi.h
> @@ -109,6 +109,7 @@ void spi_statistics_add_transfer_stats(struct spi_statistics *stats,
>   *     This may be changed by the device's driver, or left at the
>   *     default (0) indicating protocol words are eight bit bytes.
>   *     The spi_transfer.bits_per_word can override this for each transfer.
> + * @rt: Make the pump thread real time priority.
>   * @irq: Negative, or the number passed to request_irq() to receive
>   *     interrupts from this device.
>   * @controller_state: Controller's runtime state
> @@ -143,6 +144,7 @@ struct spi_device {
>         u32                     max_speed_hz;
>         u8                      chip_select;
>         u8                      bits_per_word;
> +       bool                    rt;
>         u32                     mode;
>  #define        SPI_CPHA        0x01                    /* clock phase */
>  #define        SPI_CPOL        0x02                    /* clock polarity */
> --
> 2.21.0.1020.gf2820cf01a-goog
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers
  2019-05-14 19:34   ` Guenter Roeck
@ 2019-05-14 21:23     ` Doug Anderson
  0 siblings, 0 replies; 8+ messages in thread
From: Doug Anderson @ 2019-05-14 21:23 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mark Brown, Benson Leung, Enric Balletbo i Serra,
	open list:ARM/Rockchip SoC...,
	Nicolas Boichat, Guenter Roeck, Brian Norris, Matthias Kaehlcke,
	linux-kernel

Hi,

On Tue, May 14, 2019 at 12:34 PM Guenter Roeck <groeck@google.com> wrote:

> On Tue, May 14, 2019 at 11:40 AM Douglas Anderson <dianders@chromium.org> wrote:
> > +static void cros_ec_spi_high_pri_release(struct device *dev, void *res)
> > +{
> > +       struct cros_ec_spi *ec_spi = *(struct cros_ec_spi **)res;
> > +
> > +       kthread_stop(ec_spi->high_pri_thread);
>
> Is that needed ? kthread_destroy_worker() does it for you.

Thanks for catching.  Removed.


> > +static int cros_ec_spi_devm_high_pri_alloc(struct device *dev,
> > +                                          struct cros_ec_spi *ec_spi)
> > +{
> > +       struct sched_param sched_priority = {
> > +               .sched_priority = MAX_RT_PRIO - 1,
> > +       };
> > +       struct cros_ec_spi **ptr;
> > +       int err = 0;
> > +
> > +       ptr = devres_alloc(cros_ec_spi_high_pri_release, sizeof(*ptr),
> > +                          GFP_KERNEL);
> > +       if (!ptr)
> > +               return -ENOMEM;
> > +       *ptr = ec_spi;
> > +
> > +       kthread_init_worker(&ec_spi->high_pri_worker);
> > +       ec_spi->high_pri_thread = kthread_create(kthread_worker_fn,
> > +                                                &ec_spi->high_pri_worker,
> > +                                                "cros_ec_spi_high_pri");
> > +       if (IS_ERR(ec_spi->high_pri_thread)) {
> > +               err = PTR_ERR(ec_spi->high_pri_thread);
> > +               dev_err(dev, "Can't create cros_ec high pri thread: %d\n", err);
> > +               goto err_worker_initted;
> > +       }
> > +
> > +       err = sched_setscheduler_nocheck(ec_spi->high_pri_thread,
> > +                                        SCHED_FIFO, &sched_priority);
> > +       if (err) {
> > +               dev_err(dev, "Can't set cros_ec high pri priority: %d\n", err);
> > +               goto err_thread_created;
> > +       }
> > +
> > +       wake_up_process(ec_spi->high_pri_thread);
> > +
> > +       devres_add(dev, ptr);
> > +
>
> If you move that ahead of sched_setscheduler_nocheck(), you would not
> need the err_thread_created: label.

Done


> > +       return 0;
> > +
> > +err_thread_created:
> > +       kthread_stop(ec_spi->high_pri_thread);
> > +
> > +err_worker_initted:
> > +       kthread_destroy_worker(&ec_spi->high_pri_worker);
>
> Are you sure about this ? The worker isn't started here, but
> kthread_destroy_worker() tries to stop it.

Right.  I was naively thinking that kthread_destroy_worker() was the
inverse of kthread_init_worker(), but clearly it's not.  :(

...and, in fact, looking closer at kthread_destroy_worker() it looks
like it's inherently slightly racy with regards to kthread_create().
Ick.  Specifically it will give a "WARN_ON" if worker->task hasn't
been set, but that doesn't get set until kthread_worker_fn runs the
first time.  Oh, but everyone's supposed to use
kthread_create_worker() to get around that!  :-)  Switching to that.
...and then of course everything looks so much cleaner!

...so after that I'm effectively implementing
devm_kthread_create_worker(), but I guess for now I'll just do that
unless someone thinks that I should try to get that landed...


I'll wait to send out a v4 till tomorrow morning to avoid spamming
with too many versions.  If anyone wants a preview feel free to look
at <https://crrev.com/c/1612165>

-Doug

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-05-14 21:23 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-14 18:39 [PATCH v3 0/3] spi: A better solution for cros_ec_spi reliability Douglas Anderson
2019-05-14 18:39 ` [PATCH v3 1/3] platform/chrome: cros_ec_spi: Move to real time priority for transfers Douglas Anderson
2019-05-14 19:34   ` Guenter Roeck
2019-05-14 21:23     ` Doug Anderson
     [not found] ` <20190514183935.143463-1-dianders-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
2019-05-14 18:39   ` [PATCH v3 2/3] spi: Allow SPI devices to request the pumping thread be realtime Douglas Anderson
2019-05-14 18:39     ` Douglas Anderson
2019-05-14 19:44     ` Guenter Roeck
2019-05-14 18:39 ` [PATCH v3 3/3] platform/chrome: cros_ec_spi: Request the SPI " Douglas Anderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.