linux-spi.vger.kernel.org archive mirror
* [RFC] [PATCH v2 00/11] Optimize spi_sync path
@ 2022-06-15 12:46 David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 01/11] spi: Move ctlr->cur_msg_prepared to struct spi_message David Jander
                   ` (12 more replies)
  0 siblings, 13 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

These patches optimize the spi_sync call for the common case that the
worker thread is idle and the queue is empty. They also open up the
possibility of further optimizing the async path later, since it no
longer needs to take the direct sync path into account.

As an example of the performance gain: on an i.MX8MM SoC with a SPI CAN
controller attached (MCP2518FD), the time the interrupt line stays
active (which corresponds roughly to the time it takes to send 3
relatively short consecutive spi_sync messages) is reduced from 98us to
only 72us by this patch.

A note about message ordering:

This patch series should not change the behavior of message ordering when
coming from the same context. This means that if a client driver issues
one or more spi_async() messages immediately followed by a spi_sync()
message in the same context, it can still rely on these messages being
sent out in the order they were fired.

---
v2:
 - Avoid API change to spi_finalize_current_message
 - Fixed NULL-pointer dereference for drivers that rely on ctlr->cur_msg
 - Removed intentional printk() statement
 - Split out into small patches to document how code is morphed
 
David Jander (11):
  spi: Move ctlr->cur_msg_prepared to struct spi_message
  spi: Don't use the message queue if possible in spi_sync
  spi: Lock controller idling transition inside the io_mutex
  spi: __spi_pump_messages: Consolidate spin_unlocks to goto target
  spi: Remove check for controller idling in spi sync path
  spi: Remove check for idling in __spi_pump_messages()
  spi: Remove the now unused ctlr->idling flag
  spi: Remove unneeded READ_ONCE for ctlr->busy flag
  spi: Set ctlr->cur_msg also in the sync transfer case
  spi: Ensure the io_mutex is held until spi_finalize_current_message()
  spi: opportunistically skip ctlr->cur_msg_completion

 drivers/spi/spi.c       | 305 ++++++++++++++++++++++++----------------
 include/linux/spi/spi.h |  24 +++-
 2 files changed, 202 insertions(+), 127 deletions(-)

-- 
2.32.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC] [PATCH v2 01/11] spi: Move ctlr->cur_msg_prepared to struct spi_message
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 02/11] spi: Don't use the message queue if possible in spi_sync David Jander
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

This makes it possible to transfer a message that is not at the current
tip of the async message queue.
This is in preparation for the next patch(es), which enable spi_sync
messages to skip the queue altogether.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c       | 7 ++++---
 include/linux/spi/spi.h | 7 ++++---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index c78d1ceeaa42..eb6360153fa1 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1684,7 +1684,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 			spi_finalize_current_message(ctlr);
 			goto out;
 		}
-		ctlr->cur_msg_prepared = true;
+		msg->prepared = true;
 	}
 
 	ret = spi_map_msg(ctlr, msg);
@@ -1926,7 +1926,7 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 	 */
 	spi_res_release(ctlr, mesg);
 
-	if (ctlr->cur_msg_prepared && ctlr->unprepare_message) {
+	if (mesg->prepared && ctlr->unprepare_message) {
 		ret = ctlr->unprepare_message(ctlr, mesg);
 		if (ret) {
 			dev_err(&ctlr->dev, "failed to unprepare message: %d\n",
@@ -1934,9 +1934,10 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 		}
 	}
 
+	mesg->prepared = false;
+
 	spin_lock_irqsave(&ctlr->queue_lock, flags);
 	ctlr->cur_msg = NULL;
-	ctlr->cur_msg_prepared = false;
 	ctlr->fallback = false;
 	kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index c96f526d9a20..1a75c26742f2 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -385,8 +385,6 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  * @queue: message queue
  * @idling: the device is entering idle state
  * @cur_msg: the currently in-flight message
- * @cur_msg_prepared: spi_prepare_message was called for the currently
- *                    in-flight message
  * @cur_msg_mapped: message has been mapped for DMA
  * @last_cs: the last chip_select that is recorded by set_cs, -1 on non chip
  *           selected
@@ -621,7 +619,6 @@ struct spi_controller {
 	bool				running;
 	bool				rt;
 	bool				auto_runtime_pm;
-	bool                            cur_msg_prepared;
 	bool				cur_msg_mapped;
 	char				last_cs;
 	bool				last_cs_mode_high;
@@ -988,6 +985,7 @@ struct spi_transfer {
  * @queue: for use by whichever driver currently owns the message
  * @state: for use by whichever driver currently owns the message
  * @resources: for resource management when the spi message is processed
+ * @prepared: spi_prepare_message was called for this message
  *
  * A @spi_message is used to execute an atomic sequence of data transfers,
  * each represented by a struct spi_transfer.  The sequence is "atomic"
@@ -1037,6 +1035,9 @@ struct spi_message {
 
 	/* list of spi_res reources when the spi message is processed */
 	struct list_head        resources;
+
+	/* spi_prepare_message was called for this message */
+	bool                    prepared;
 };
 
 static inline void spi_message_init_no_memset(struct spi_message *m)
-- 
2.32.0



* [RFC] [PATCH v2 02/11] spi: Don't use the message queue if possible in spi_sync
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 01/11] spi: Move ctlr->cur_msg_prepared to struct spi_message David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 03/11] spi: Lock controller idling transition inside the io_mutex David Jander
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

The interaction with the controller message queue and its corresponding
auxiliary flags and variables requires the use of the queue_lock, which is
costly. Since spi_sync will transfer the complete message anyway, and not
return until it is finished, there is no need to put the message into the
queue if the queue is empty. This can save a lot of overhead.

As an example of how significant this is: when using the MCP2518FD SPI CAN
controller on an i.MX8MM SoC, the time during which the interrupt line
stays active (spanning 3 relatively short spi_sync messages) is reduced
from 98us to 72us by this patch.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c       | 246 ++++++++++++++++++++++++----------------
 include/linux/spi/spi.h |  11 +-
 2 files changed, 159 insertions(+), 98 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index eb6360153fa1..2d057d03c4f7 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1549,6 +1549,80 @@ static void spi_idle_runtime_pm(struct spi_controller *ctlr)
 	}
 }
 
+static int __spi_pump_transfer_message(struct spi_controller *ctlr,
+		struct spi_message *msg, bool was_busy)
+{
+	struct spi_transfer *xfer;
+	int ret;
+
+	if (!was_busy && ctlr->auto_runtime_pm) {
+		ret = pm_runtime_get_sync(ctlr->dev.parent);
+		if (ret < 0) {
+			pm_runtime_put_noidle(ctlr->dev.parent);
+			dev_err(&ctlr->dev, "Failed to power device: %d\n",
+				ret);
+			return ret;
+		}
+	}
+
+	if (!was_busy)
+		trace_spi_controller_busy(ctlr);
+
+	if (!was_busy && ctlr->prepare_transfer_hardware) {
+		ret = ctlr->prepare_transfer_hardware(ctlr);
+		if (ret) {
+			dev_err(&ctlr->dev,
+				"failed to prepare transfer hardware: %d\n",
+				ret);
+
+			if (ctlr->auto_runtime_pm)
+				pm_runtime_put(ctlr->dev.parent);
+
+			msg->status = ret;
+			spi_finalize_current_message(ctlr);
+
+			return ret;
+		}
+	}
+
+	trace_spi_message_start(msg);
+
+	if (ctlr->prepare_message) {
+		ret = ctlr->prepare_message(ctlr, msg);
+		if (ret) {
+			dev_err(&ctlr->dev, "failed to prepare message: %d\n",
+				ret);
+			msg->status = ret;
+			spi_finalize_current_message(ctlr);
+			return ret;
+		}
+		msg->prepared = true;
+	}
+
+	ret = spi_map_msg(ctlr, msg);
+	if (ret) {
+		msg->status = ret;
+		spi_finalize_current_message(ctlr);
+		return ret;
+	}
+
+	if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) {
+		list_for_each_entry(xfer, &msg->transfers, transfer_list) {
+			xfer->ptp_sts_word_pre = 0;
+			ptp_read_system_prets(xfer->ptp_sts);
+		}
+	}
+
+	ret = ctlr->transfer_one_message(ctlr, msg);
+	if (ret) {
+		dev_err(&ctlr->dev,
+			"failed to transfer one message from queue\n");
+		return ret;
+	}
+
+	return 0;
+}
+
 /**
  * __spi_pump_messages - function which processes spi message queue
  * @ctlr: controller to process queue for
@@ -1564,7 +1638,6 @@ static void spi_idle_runtime_pm(struct spi_controller *ctlr)
  */
 static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 {
-	struct spi_transfer *xfer;
 	struct spi_message *msg;
 	bool was_busy = false;
 	unsigned long flags;
@@ -1599,6 +1672,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 			    !ctlr->unprepare_transfer_hardware) {
 				spi_idle_runtime_pm(ctlr);
 				ctlr->busy = false;
+				ctlr->queue_empty = true;
 				trace_spi_controller_idle(ctlr);
 			} else {
 				kthread_queue_work(ctlr->kworker,
@@ -1625,6 +1699,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 
 		spin_lock_irqsave(&ctlr->queue_lock, flags);
 		ctlr->idling = false;
+		ctlr->queue_empty = true;
 		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 		return;
 	}
@@ -1641,75 +1716,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
 	mutex_lock(&ctlr->io_mutex);
-
-	if (!was_busy && ctlr->auto_runtime_pm) {
-		ret = pm_runtime_resume_and_get(ctlr->dev.parent);
-		if (ret < 0) {
-			dev_err(&ctlr->dev, "Failed to power device: %d\n",
-				ret);
-			mutex_unlock(&ctlr->io_mutex);
-			return;
-		}
-	}
-
-	if (!was_busy)
-		trace_spi_controller_busy(ctlr);
-
-	if (!was_busy && ctlr->prepare_transfer_hardware) {
-		ret = ctlr->prepare_transfer_hardware(ctlr);
-		if (ret) {
-			dev_err(&ctlr->dev,
-				"failed to prepare transfer hardware: %d\n",
-				ret);
-
-			if (ctlr->auto_runtime_pm)
-				pm_runtime_put(ctlr->dev.parent);
-
-			msg->status = ret;
-			spi_finalize_current_message(ctlr);
-
-			mutex_unlock(&ctlr->io_mutex);
-			return;
-		}
-	}
-
-	trace_spi_message_start(msg);
-
-	if (ctlr->prepare_message) {
-		ret = ctlr->prepare_message(ctlr, msg);
-		if (ret) {
-			dev_err(&ctlr->dev, "failed to prepare message: %d\n",
-				ret);
-			msg->status = ret;
-			spi_finalize_current_message(ctlr);
-			goto out;
-		}
-		msg->prepared = true;
-	}
-
-	ret = spi_map_msg(ctlr, msg);
-	if (ret) {
-		msg->status = ret;
-		spi_finalize_current_message(ctlr);
-		goto out;
-	}
-
-	if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) {
-		list_for_each_entry(xfer, &msg->transfers, transfer_list) {
-			xfer->ptp_sts_word_pre = 0;
-			ptp_read_system_prets(xfer->ptp_sts);
-		}
-	}
-
-	ret = ctlr->transfer_one_message(ctlr, msg);
-	if (ret) {
-		dev_err(&ctlr->dev,
-			"failed to transfer one message from queue: %d\n",
-			ret);
-		goto out;
-	}
-
-out:
+	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
 	mutex_unlock(&ctlr->io_mutex);
 
 	/* Prod the scheduler in case transfer_one() was busy waiting */
@@ -1839,6 +1846,7 @@ static int spi_init_queue(struct spi_controller *ctlr)
 {
 	ctlr->running = false;
 	ctlr->busy = false;
+	ctlr->queue_empty = true;
 
 	ctlr->kworker = kthread_create_worker(0, dev_name(&ctlr->dev));
 	if (IS_ERR(ctlr->kworker)) {
@@ -1936,11 +1944,20 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 
 	mesg->prepared = false;
 
-	spin_lock_irqsave(&ctlr->queue_lock, flags);
-	ctlr->cur_msg = NULL;
-	ctlr->fallback = false;
-	kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
-	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
+	if (!mesg->sync) {
+		/*
+		 * This message was sent via the async message queue. Handle
+		 * the queue and kick the worker thread to do the
+		 * idling/shutdown or send the next message if needed.
+		 */
+		spin_lock_irqsave(&ctlr->queue_lock, flags);
+		WARN(ctlr->cur_msg != mesg,
+			"Finalizing queued message that is not the current head of queue!");
+		ctlr->cur_msg = NULL;
+		ctlr->fallback = false;
+		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
+		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
+	}
 
 	trace_spi_message_done(mesg);
 
@@ -2043,6 +2060,7 @@ static int __spi_queued_transfer(struct spi_device *spi,
 	msg->status = -EINPROGRESS;
 
 	list_add_tail(&msg->queue, &ctlr->queue);
+	ctlr->queue_empty = false;
 	if (!ctlr->busy && need_pump)
 		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
 
@@ -3938,6 +3956,39 @@ static int spi_async_locked(struct spi_device *spi, struct spi_message *message)
 
 }
 
+static void __spi_transfer_message_noqueue(struct spi_controller *ctlr, struct spi_message *msg)
+{
+	bool was_busy;
+	int ret;
+
+	mutex_lock(&ctlr->io_mutex);
+
+	/* If another context is idling the device then wait */
+	while (ctlr->idling)
+		usleep_range(10000, 11000);
+
+	was_busy = READ_ONCE(ctlr->busy);
+
+	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
+	if (ret)
+		goto out;
+
+	if (!was_busy) {
+		kfree(ctlr->dummy_rx);
+		ctlr->dummy_rx = NULL;
+		kfree(ctlr->dummy_tx);
+		ctlr->dummy_tx = NULL;
+		if (ctlr->unprepare_transfer_hardware &&
+		    ctlr->unprepare_transfer_hardware(ctlr))
+			dev_err(&ctlr->dev,
+				"failed to unprepare transfer hardware\n");
+		spi_idle_runtime_pm(ctlr);
+	}
+
+out:
+	mutex_unlock(&ctlr->io_mutex);
+}
+
 /*-------------------------------------------------------------------------*/
 
 /*
@@ -3956,51 +4007,52 @@ static int __spi_sync(struct spi_device *spi, struct spi_message *message)
 	DECLARE_COMPLETION_ONSTACK(done);
 	int status;
 	struct spi_controller *ctlr = spi->controller;
-	unsigned long flags;
 
 	status = __spi_validate(spi, message);
 	if (status != 0)
 		return status;
 
-	message->complete = spi_complete;
-	message->context = &done;
 	message->spi = spi;
 
 	SPI_STATISTICS_INCREMENT_FIELD(ctlr->pcpu_statistics, spi_sync);
 	SPI_STATISTICS_INCREMENT_FIELD(spi->pcpu_statistics, spi_sync);
 
 	/*
-	 * If we're not using the legacy transfer method then we will
-	 * try to transfer in the calling context so special case.
-	 * This code would be less tricky if we could remove the
-	 * support for driver implemented message queues.
+	 * Checking queue_empty here only guarantees async/sync message
+	 * ordering when coming from the same context. It does not need to
+	 * guard against reentrancy from a different context. The io_mutex
+	 * will catch those cases.
 	 */
-	if (ctlr->transfer == spi_queued_transfer) {
-		spin_lock_irqsave(&ctlr->bus_lock_spinlock, flags);
+	if (READ_ONCE(ctlr->queue_empty)) {
+		message->sync = true;
+		message->actual_length = 0;
+		message->status = -EINPROGRESS;
 
 		trace_spi_message_submit(message);
 
-		status = __spi_queued_transfer(spi, message, false);
+		SPI_STATISTICS_INCREMENT_FIELD(ctlr->pcpu_statistics, spi_sync_immediate);
+		SPI_STATISTICS_INCREMENT_FIELD(spi->pcpu_statistics, spi_sync_immediate);
 
-		spin_unlock_irqrestore(&ctlr->bus_lock_spinlock, flags);
-	} else {
-		status = spi_async_locked(spi, message);
+		__spi_transfer_message_noqueue(ctlr, message);
+
+		return message->status;
 	}
 
+	/*
+	 * There are messages in the async queue that could have originated
+	 * from the same context, so we need to preserve ordering.
+	 * Therefore we send the message to the async queue and wait until it
+	 * is completed.
+	 */
+	message->complete = spi_complete;
+	message->context = &done;
+	status = spi_async_locked(spi, message);
 	if (status == 0) {
-		/* Push out the messages in the calling context if we can */
-		if (ctlr->transfer == spi_queued_transfer) {
-			SPI_STATISTICS_INCREMENT_FIELD(ctlr->pcpu_statistics,
-						       spi_sync_immediate);
-			SPI_STATISTICS_INCREMENT_FIELD(spi->pcpu_statistics,
-						       spi_sync_immediate);
-			__spi_pump_messages(ctlr, false);
-		}
-
 		wait_for_completion(&done);
 		status = message->status;
 	}
 	message->context = NULL;
+
 	return status;
 }
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 1a75c26742f2..74261a83b5fa 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -461,6 +461,8 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  * @irq_flags: Interrupt enable state during PTP system timestamping
  * @fallback: fallback to pio if dma transfer return failure with
  *	SPI_TRANS_FAIL_NO_START.
+ * @queue_empty: signal green light for opportunistically skipping the queue
+ *	for spi_sync transfers.
  *
  * Each SPI controller can communicate with one or more @spi_device
  * children.  These make a small bus, sharing MOSI, MISO and SCK signals
@@ -677,6 +679,9 @@ struct spi_controller {
 
 	/* Interrupt enable state during PTP system timestamping */
 	unsigned long		irq_flags;
+
+	/* Flag for enabling opportunistic skipping of the queue in spi_sync */
+	bool			queue_empty;
 };
 
 static inline void *spi_controller_get_devdata(struct spi_controller *ctlr)
@@ -986,6 +991,7 @@ struct spi_transfer {
  * @state: for use by whichever driver currently owns the message
  * @resources: for resource management when the spi message is processed
 * @prepared: spi_prepare_message was called for this message
+ * @sync: this message took the direct sync path skipping the async queue
  *
  * A @spi_message is used to execute an atomic sequence of data transfers,
  * each represented by a struct spi_transfer.  The sequence is "atomic"
@@ -1037,7 +1043,10 @@ struct spi_message {
 	struct list_head        resources;
 
 	/* spi_prepare_message was called for this message */
-	bool                    prepared;
+	bool			prepared;
+
+	/* this message is skipping the async queue */
+	bool			sync;
 };
 
 static inline void spi_message_init_no_memset(struct spi_message *m)
-- 
2.32.0



* [RFC] [PATCH v2 03/11] spi: Lock controller idling transition inside the io_mutex
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 01/11] spi: Move ctlr->cur_msg_prepared to struct spi_message David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 02/11] spi: Don't use the message queue if possible in spi_sync David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 04/11] spi: __spi_pump_messages: Consolidate spin_unlocks to goto target David Jander
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

This way, the spi sync path does not need to deal with the idling
transition.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 2d057d03c4f7..cfff2ff96fa0 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1643,27 +1643,30 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	unsigned long flags;
 	int ret;
 
+	/* Take the IO mutex */
+	mutex_lock(&ctlr->io_mutex);
+
 	/* Lock queue */
 	spin_lock_irqsave(&ctlr->queue_lock, flags);
 
 	/* Make sure we are not already running a message */
 	if (ctlr->cur_msg) {
 		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* If another context is idling the device then defer */
 	if (ctlr->idling) {
 		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
 		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* Check if the queue is idle */
 	if (list_empty(&ctlr->queue) || !ctlr->running) {
 		if (!ctlr->busy) {
 			spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-			return;
+			goto out_unlock;
 		}
 
 		/* Defer any non-atomic teardown to the thread */
@@ -1679,7 +1682,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 						   &ctlr->pump_messages);
 			}
 			spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-			return;
+			goto out_unlock;
 		}
 
 		ctlr->busy = false;
@@ -1701,7 +1704,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		ctlr->idling = false;
 		ctlr->queue_empty = true;
 		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-		return;
+		goto out_unlock;
 	}
 
 	/* Extract head of queue */
@@ -1715,13 +1718,16 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		ctlr->busy = true;
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
-	mutex_lock(&ctlr->io_mutex);
 	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
 	mutex_unlock(&ctlr->io_mutex);
 
 	/* Prod the scheduler in case transfer_one() was busy waiting */
 	if (!ret)
 		cond_resched();
+	return;
+
+out_unlock:
+	mutex_unlock(&ctlr->io_mutex);
 }
 
 /**
-- 
2.32.0



* [RFC] [PATCH v2 04/11] spi: __spi_pump_messages: Consolidate spin_unlocks to goto target
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (2 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 03/11] spi: Lock controller idling transition inside the io_mutex David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 05/11] spi: Remove check for controller idling in spi sync path David Jander
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index cfff2ff96fa0..fa2d091d2854 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1650,10 +1650,8 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	spin_lock_irqsave(&ctlr->queue_lock, flags);
 
 	/* Make sure we are not already running a message */
-	if (ctlr->cur_msg) {
-		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
+	if (ctlr->cur_msg)
 		goto out_unlock;
-	}
 
 	/* If another context is idling the device then defer */
 	if (ctlr->idling) {
@@ -1664,10 +1662,8 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 
 	/* Check if the queue is idle */
 	if (list_empty(&ctlr->queue) || !ctlr->running) {
-		if (!ctlr->busy) {
-			spin_unlock_irqrestore(&ctlr->queue_lock, flags);
+		if (!ctlr->busy)
 			goto out_unlock;
-		}
 
 		/* Defer any non-atomic teardown to the thread */
 		if (!in_kthread) {
@@ -1681,7 +1677,6 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 				kthread_queue_work(ctlr->kworker,
 						   &ctlr->pump_messages);
 			}
-			spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 			goto out_unlock;
 		}
 
@@ -1703,7 +1698,6 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		spin_lock_irqsave(&ctlr->queue_lock, flags);
 		ctlr->idling = false;
 		ctlr->queue_empty = true;
-		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 		goto out_unlock;
 	}
 
@@ -1727,6 +1721,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	return;
 
 out_unlock:
+	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 	mutex_unlock(&ctlr->io_mutex);
 }
 
-- 
2.32.0



* [RFC] [PATCH v2 05/11] spi: Remove check for controller idling in spi sync path
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (3 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 04/11] spi: __spi_pump_messages: Consolidate spin_unlocks to goto target David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 06/11] spi: Remove check for idling in __spi_pump_messages() David Jander
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

Now that the idling flag is wholly behind the io_mutex, this broken piece
of code can be safely removed.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index fa2d091d2854..d8d2b7ac78f2 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -3964,10 +3964,6 @@ static void __spi_transfer_message_noqueue(struct spi_controller *ctlr, struct s
 
 	mutex_lock(&ctlr->io_mutex);
 
-	/* If another context is idling the device then wait */
-	while (ctlr->idling)
-		usleep_range(10000, 11000);
-
 	was_busy = READ_ONCE(ctlr->busy);
 
 	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
-- 
2.32.0



* [RFC] [PATCH v2 06/11] spi: Remove check for idling in __spi_pump_messages()
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (4 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 05/11] spi: Remove check for controller idling in spi sync path David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 07/11] spi: Remove the now unused ctlr->idling flag David Jander
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

Since the whole idling transition is locked by the io_mutex now, there is
no need to check this flag anymore.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index d8d2b7ac78f2..71b767a9ad77 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1653,13 +1653,6 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	if (ctlr->cur_msg)
 		goto out_unlock;
 
-	/* If another context is idling the device then defer */
-	if (ctlr->idling) {
-		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
-		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-		goto out_unlock;
-	}
-
 	/* Check if the queue is idle */
 	if (list_empty(&ctlr->queue) || !ctlr->running) {
 		if (!ctlr->busy)
-- 
2.32.0



* [RFC] [PATCH v2 07/11] spi: Remove the now unused ctlr->idling flag
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (5 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 06/11] spi: Remove check for idling in __spi_pump_messages() David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 08/11] spi: Remove unneeded READ_ONCE for ctlr->busy flag David Jander
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

The ctlr->idling flag is never checked now, so we don't need to set it
either.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c       | 2 --
 include/linux/spi/spi.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 71b767a9ad77..52736e339645 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1674,7 +1674,6 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		}
 
 		ctlr->busy = false;
-		ctlr->idling = true;
 		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
 		kfree(ctlr->dummy_rx);
@@ -1689,7 +1688,6 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 		trace_spi_controller_idle(ctlr);
 
 		spin_lock_irqsave(&ctlr->queue_lock, flags);
-		ctlr->idling = false;
 		ctlr->queue_empty = true;
 		goto out_unlock;
 	}
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index 74261a83b5fa..c58f46be762f 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -383,7 +383,6 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  * @pump_messages: work struct for scheduling work to the message pump
  * @queue_lock: spinlock to syncronise access to message queue
  * @queue: message queue
- * @idling: the device is entering idle state
  * @cur_msg: the currently in-flight message
  * @cur_msg_mapped: message has been mapped for DMA
  * @last_cs: the last chip_select that is recorded by set_cs, -1 on non chip
@@ -616,7 +615,6 @@ struct spi_controller {
 	spinlock_t			queue_lock;
 	struct list_head		queue;
 	struct spi_message		*cur_msg;
-	bool				idling;
 	bool				busy;
 	bool				running;
 	bool				rt;
-- 
2.32.0



* [RFC] [PATCH v2 08/11] spi: Remove unneeded READ_ONCE for ctlr->busy flag
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (6 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 07/11] spi: Remove the now unused ctlr->idling flag David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 09/11] spi: Set ctlr->cur_msg also in the sync transfer case David Jander
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

This flag is now only written while holding the io_mutex, so READ_ONCE is
no longer needed.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 52736e339645..29f42753ef0f 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -3955,7 +3955,7 @@ static void __spi_transfer_message_noqueue(struct spi_controller *ctlr, struct s
 
 	mutex_lock(&ctlr->io_mutex);
 
-	was_busy = READ_ONCE(ctlr->busy);
+	was_busy = ctlr->busy;
 
 	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
 	if (ret)
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread
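
[Editorial sketch] The reasoning behind this patch can be illustrated with a
userspace analogue: READ_ONCE() is only needed when a flag may be written
concurrently outside the lock; once every writer holds the io_mutex, a plain
read under that same mutex is safe. A minimal Python sketch, with
threading.Lock standing in for the io_mutex (all names hypothetical):

```python
import threading

class Controller:
    """Stand-in for spi_controller: 'busy' is only read or written
    while io_mutex is held, so no special atomic access (the kernel's
    READ_ONCE) is needed for the read."""
    def __init__(self):
        self.io_mutex = threading.Lock()
        self.busy = False

    def transfer_message_noqueue(self, msg):
        with self.io_mutex:           # every access under the same lock
            was_busy = self.busy      # plain read is safe here
            self.busy = True
            result = f"sent:{msg}"    # placeholder for the real transfer
            self.busy = was_busy
        return result

ctlr = Controller()
print(ctlr.transfer_message_noqueue("m1"))   # sent:m1
```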

* [RFC] [PATCH v2 09/11] spi: Set ctlr->cur_msg also in the sync transfer case
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (7 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH 08/11] spi: Remove unneeded READ_ONCE for ctlr->busy flag David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 10/11] spi: Ensure the io_mutex is held until spi_finalize_current_message() David Jander
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

Some drivers rely on ctlr->cur_msg pointing to the message currently being
processed, so set it in the sync path as well.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 29f42753ef0f..3df84f43918c 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -3957,6 +3957,7 @@ static void __spi_transfer_message_noqueue(struct spi_controller *ctlr, struct s
 
 	was_busy = ctlr->busy;
 
+	ctlr->cur_msg = msg;
 	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
 	if (ret)
 		goto out;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread
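
[Editorial sketch] The motivation for this patch: some drivers locate the
in-flight message through the controller pointer, for example from their
interrupt handler, so the direct sync path must populate ctlr->cur_msg too.
A hypothetical Python sketch of what would break otherwise (names invented
for illustration):

```python
class Controller:
    def __init__(self):
        self.cur_msg = None

def driver_irq_handler(ctlr):
    # A driver that finds the in-flight message via the controller,
    # as some real drivers do from interrupt context.
    msg = ctlr.cur_msg
    if msg is None:
        raise RuntimeError("driver dereferenced a NULL cur_msg")
    msg["status"] = 0        # mark the transfer done
    return 0

ctlr = Controller()
msg = {"status": -115}       # -EINPROGRESS
ctlr.cur_msg = msg           # what this patch adds to the sync path
driver_irq_handler(ctlr)
print(msg["status"])         # 0
```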

* [RFC] [PATCH v2 10/11] spi: Ensure the io_mutex is held until spi_finalize_current_message()
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (8 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 09/11] spi: Set ctlr->cur_msg also in the sync transfer case David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 12:46 ` [RFC] [PATCH v2 11/11] spi: opportunistically skip ctlr->cur_msg_completion David Jander
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

This patch introduces a completion that is completed in
spi_finalize_current_message() and waited for in
__spi_pump_transfer_message(). This way all manipulation of ctlr->cur_msg
is done with the io_mutex held and strictly ordered:
__spi_pump_transfer_message() will not return until
spi_finalize_current_message() is done using ctlr->cur_msg, and its
calling context only touches ctlr->cur_msg after it returns.
Due to this, we can safely drop the spin-locks around ctlr->cur_msg.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c       | 32 ++++++++++++++------------------
 include/linux/spi/spi.h |  6 ++----
 2 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 3df84f43918c..db08cb868652 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1613,11 +1613,14 @@ static int __spi_pump_transfer_message(struct spi_controller *ctlr,
 		}
 	}
 
+	reinit_completion(&ctlr->cur_msg_completion);
 	ret = ctlr->transfer_one_message(ctlr, msg);
 	if (ret) {
 		dev_err(&ctlr->dev,
 			"failed to transfer one message from queue\n");
 		return ret;
+	} else {
+		wait_for_completion(&ctlr->cur_msg_completion);
 	}
 
 	return 0;
@@ -1704,6 +1707,12 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
 	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
 	ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
+
+	if (!ret)
+		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
+	ctlr->cur_msg = NULL;
+	ctlr->fallback = false;
+
 	mutex_unlock(&ctlr->io_mutex);
 
 	/* Prod the scheduler in case transfer_one() was busy waiting */
@@ -1897,12 +1906,9 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 {
 	struct spi_transfer *xfer;
 	struct spi_message *mesg;
-	unsigned long flags;
 	int ret;
 
-	spin_lock_irqsave(&ctlr->queue_lock, flags);
 	mesg = ctlr->cur_msg;
-	spin_unlock_irqrestore(&ctlr->queue_lock, flags);
 
 	if (!ctlr->ptp_sts_supported && !ctlr->transfer_one) {
 		list_for_each_entry(xfer, &mesg->transfers, transfer_list) {
@@ -1936,20 +1942,7 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 
 	mesg->prepared = false;
 
-	if (!mesg->sync) {
-		/*
-		 * This message was sent via the async message queue. Handle
-		 * the queue and kick the worker thread to do the
-		 * idling/shutdown or send the next message if needed.
-		 */
-		spin_lock_irqsave(&ctlr->queue_lock, flags);
-		WARN(ctlr->cur_msg != mesg,
-			"Finalizing queued message that is not the current head of queue!");
-		ctlr->cur_msg = NULL;
-		ctlr->fallback = false;
-		kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
-		spin_unlock_irqrestore(&ctlr->queue_lock, flags);
-	}
+	complete(&ctlr->cur_msg_completion);
 
 	trace_spi_message_done(mesg);
 
@@ -3036,6 +3029,7 @@ int spi_register_controller(struct spi_controller *ctlr)
 	}
 	ctlr->bus_lock_flag = 0;
 	init_completion(&ctlr->xfer_completion);
+	init_completion(&ctlr->cur_msg_completion);
 	if (!ctlr->max_dma_len)
 		ctlr->max_dma_len = INT_MAX;
 
@@ -3962,6 +3956,9 @@ static void __spi_transfer_message_noqueue(struct spi_controller *ctlr, struct s
 	if (ret)
 		goto out;
 
+	ctlr->cur_msg = NULL;
+	ctlr->fallback = false;
+
 	if (!was_busy) {
 		kfree(ctlr->dummy_rx);
 		ctlr->dummy_rx = NULL;
@@ -4013,7 +4010,6 @@ static int __spi_sync(struct spi_device *spi, struct spi_message *message)
 	 * will catch those cases.
 	 */
 	if (READ_ONCE(ctlr->queue_empty)) {
-		message->sync = true;
 		message->actual_length = 0;
 		message->status = -EINPROGRESS;
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index c58f46be762f..c56e0d240a58 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -384,6 +384,7 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  * @queue_lock: spinlock to syncronise access to message queue
  * @queue: message queue
  * @cur_msg: the currently in-flight message
+ * @cur_msg_completion: a completion for the current in-flight message
  * @cur_msg_mapped: message has been mapped for DMA
  * @last_cs: the last chip_select that is recorded by set_cs, -1 on non chip
  *           selected
@@ -615,6 +616,7 @@ struct spi_controller {
 	spinlock_t			queue_lock;
 	struct list_head		queue;
 	struct spi_message		*cur_msg;
+	struct completion               cur_msg_completion;
 	bool				busy;
 	bool				running;
 	bool				rt;
@@ -989,7 +991,6 @@ struct spi_transfer {
  * @state: for use by whichever driver currently owns the message
  * @resources: for resource management when the spi message is processed
  * @prepared: spi_prepare_message was called for the this message
- * @sync: this message took the direct sync path skipping the async queue
  *
  * A @spi_message is used to execute an atomic sequence of data transfers,
  * each represented by a struct spi_transfer.  The sequence is "atomic"
@@ -1042,9 +1043,6 @@ struct spi_message {
 
 	/* spi_prepare_message was called for this message */
 	bool			prepared;
-
-	/* this message is skipping the async queue */
-	bool			sync;
 };
 
 static inline void spi_message_init_no_memset(struct spi_message *m)
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread
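
[Editorial sketch] The ordering that patch 10 establishes, namely that
__spi_pump_transfer_message() does not return until
spi_finalize_current_message() is done with ctlr->cur_msg, can be modelled
with a userspace analogue in which threading.Event plays the role of the
completion (all names hypothetical; this is not the kernel code itself):

```python
import threading

class Controller:
    def __init__(self):
        self.io_mutex = threading.Lock()
        self.cur_msg = None
        self.cur_msg_completion = threading.Event()

    def pump_transfer_message(self, msg, finalize_async):
        # Caller holds io_mutex, mirroring __spi_pump_messages().
        self.cur_msg = msg
        self.cur_msg_completion.clear()      # reinit_completion()
        # transfer_one_message(): here finalization happens in another
        # thread, like drivers that finish in interrupt context.
        t = threading.Thread(target=finalize_async, args=(self,))
        t.start()
        self.cur_msg_completion.wait()       # wait_for_completion()
        t.join()
        self.cur_msg = None                  # safe: finalize is done

def finalize_current_message(ctlr):
    ctlr.cur_msg["status"] = 0
    ctlr.cur_msg_completion.set()            # complete()

ctlr = Controller()
msg = {"status": -115}                       # -EINPROGRESS
with ctlr.io_mutex:
    ctlr.pump_transfer_message(msg, finalize_current_message)
print(msg["status"], ctlr.cur_msg)           # 0 None
```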

* [RFC] [PATCH v2 11/11] spi: opportunistically skip ctlr->cur_msg_completion
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (9 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 10/11] spi: Ensure the io_mutex is held until spi_finalize_current_message() David Jander
@ 2022-06-15 12:46 ` David Jander
  2022-06-15 13:31 ` [RFC] [PATCH v2 00/11] Optimize spi_sync path Marc Kleine-Budde
  2022-06-16 13:22 ` Mark Brown
  12 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-15 12:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn, David Jander

There are only a few drivers that do not call
spi_finalize_current_message() in the context of transfer_one_message(),
and even for those cases the completion ctlr->cur_msg_completion is not
always needed. The calls to complete() and wait_for_completion() each
take a spin-lock, which is costly. This patch makes it possible to avoid
those calls in the big majority of cases by introducing two flags that,
combined with ordering via memory barriers, allow the completion to be
skipped safely. In case of a race with the context calling
spi_finalize_current_message(), the scheme errs on the safe side and
takes the completion.
The impact of this patch is worth the effort: on an i.MX8MM SoC, the time
the SPI bus is idle between two consecutive calls to spi_sync() is
reduced from 19.6us to 16.8us, roughly 15%.

Signed-off-by: David Jander <david@protonic.nl>
---
 drivers/spi/spi.c       | 27 +++++++++++++++++++++++++--
 include/linux/spi/spi.h |  8 ++++++++
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index db08cb868652..ef37f043fd17 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1613,14 +1613,34 @@ static int __spi_pump_transfer_message(struct spi_controller *ctlr,
 		}
 	}
 
+	/*
+	 * A driver's implementation of transfer_one_message() must arrange
+	 * for spi_finalize_current_message() to get called. Most drivers do
+	 * this in the calling context, but some don't. For those cases, a
+	 * completion is used to guarantee that this function does not return
+	 * until spi_finalize_current_message() is done accessing
+	 * ctlr->cur_msg.
+	 * The following two flags allow this completion to be skipped
+	 * opportunistically, since taking it involves expensive spin locks.
+	 * In case of a race with the context that calls
+	 * spi_finalize_current_message(), the completion will always be
+	 * used, due to strict ordering of these flags using barriers.
+	 */
+	WRITE_ONCE(ctlr->cur_msg_incomplete, true);
+	WRITE_ONCE(ctlr->cur_msg_need_completion, false);
 	reinit_completion(&ctlr->cur_msg_completion);
+	smp_wmb(); /* make these available to spi_finalize_current_message */
+
 	ret = ctlr->transfer_one_message(ctlr, msg);
 	if (ret) {
 		dev_err(&ctlr->dev,
 			"failed to transfer one message from queue\n");
 		return ret;
 	} else {
-		wait_for_completion(&ctlr->cur_msg_completion);
+		WRITE_ONCE(ctlr->cur_msg_need_completion, true);
+		smp_mb(); /* see spi_finalize_current_message()... */
+		if (READ_ONCE(ctlr->cur_msg_incomplete))
+			wait_for_completion(&ctlr->cur_msg_completion);
 	}
 
 	return 0;
@@ -1942,7 +1962,10 @@ void spi_finalize_current_message(struct spi_controller *ctlr)
 
 	mesg->prepared = false;
 
-	complete(&ctlr->cur_msg_completion);
+	WRITE_ONCE(ctlr->cur_msg_incomplete, false);
+	smp_mb(); /* See __spi_pump_transfer_message()... */
+	if (READ_ONCE(ctlr->cur_msg_need_completion))
+		complete(&ctlr->cur_msg_completion);
 
 	trace_spi_message_done(mesg);
 
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index c56e0d240a58..eb0d316e3c36 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -385,6 +385,12 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
  * @queue: message queue
  * @cur_msg: the currently in-flight message
  * @cur_msg_completion: a completion for the current in-flight message
+ * @cur_msg_incomplete: Flag used internally to opportunistically skip
+ *	the @cur_msg_completion. This flag is used to check if the driver has
+ *	already called spi_finalize_current_message().
+ * @cur_msg_need_completion: Flag used internally to opportunistically skip
+ *	the @cur_msg_completion. This flag is used to signal the context that
+ *	is running spi_finalize_current_message() that it needs to complete()
  * @cur_msg_mapped: message has been mapped for DMA
  * @last_cs: the last chip_select that is recorded by set_cs, -1 on non chip
  *           selected
@@ -617,6 +623,8 @@ struct spi_controller {
 	struct list_head		queue;
 	struct spi_message		*cur_msg;
 	struct completion               cur_msg_completion;
+	bool				cur_msg_incomplete;
+	bool				cur_msg_need_completion;
 	bool				busy;
 	bool				running;
 	bool				rt;
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread
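
[Editorial sketch] The two-flag protocol of patch 11 errs on the safe side:
whichever context loses the race observes the other side's flag and falls
back to the completion. A Python analogue of the state logic (the kernel's
smp_wmb()/smp_mb() barriers are marked by comments but omitted, since
CPython's GIL makes them unnecessary in this sketch; all names hypothetical):

```python
import threading

class Controller:
    def __init__(self):
        self.cur_msg_completion = threading.Event()
        self.cur_msg_incomplete = False
        self.cur_msg_need_completion = False
        self.used_completion = False     # instrumentation for this sketch

def start_transfer(ctlr):
    # __spi_pump_transfer_message(), before transfer_one_message()
    ctlr.cur_msg_incomplete = True
    ctlr.cur_msg_need_completion = False
    ctlr.cur_msg_completion.clear()
    # smp_wmb() would go here in the kernel

def finish_transfer(ctlr):
    # __spi_pump_transfer_message(), after transfer_one_message()
    ctlr.cur_msg_need_completion = True
    # smp_mb() would go here in the kernel
    if ctlr.cur_msg_incomplete:          # driver not done yet
        ctlr.used_completion = True
        ctlr.cur_msg_completion.wait()

def finalize_current_message(ctlr):
    ctlr.cur_msg_incomplete = False
    # smp_mb() would go here in the kernel
    if ctlr.cur_msg_need_completion:
        ctlr.cur_msg_completion.set()

# Common case: the driver finalizes within transfer_one_message(),
# so the expensive completion is skipped entirely.
fast = Controller()
start_transfer(fast)
finalize_current_message(fast)   # called "inside" the transfer
finish_transfer(fast)
print(fast.used_completion)      # False

# Deferred case: finalize happens later (e.g. from IRQ context),
# so the waiting side takes the completion.
slow = Controller()
start_transfer(slow)
t = threading.Timer(0.05, finalize_current_message, args=(slow,))
t.start()
finish_transfer(slow)
t.join()
print(slow.used_completion)      # True
```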

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (10 preceding siblings ...)
  2022-06-15 12:46 ` [RFC] [PATCH v2 11/11] spi: opportunistically skip ctlr->cur_msg_completion David Jander
@ 2022-06-15 13:31 ` Marc Kleine-Budde
  2022-06-15 14:13   ` David Jander
  2022-06-16 13:22 ` Mark Brown
  12 siblings, 1 reply; 21+ messages in thread
From: Marc Kleine-Budde @ 2022-06-15 13:31 UTC (permalink / raw)
  To: David Jander; +Cc: Mark Brown, linux-spi, Andrew Lunn

[-- Attachment #1: Type: text/plain, Size: 1354 bytes --]

On 15.06.2022 14:46:23, David Jander wrote:
> These patches optimize the spi_sync call for the common case that the
> worker thread is idle and the queue is empty. It also opens the
> possibility to potentially further optimize the async path also, since
> it doesn't need to take into account the direct sync path anymore.
> 
> As an example for the performance gain, on an i.MX8MM SoC with a SPI CAN
> controller attached (MCP2518FD), the time the interrupt line stays
> active (which corresponds roughly with the time it takes to send 3
> relatively short consecutive spi_sync messages) is reduced from 98us to
> only 72us by this patch.
> 
> A note about message ordering:
> 
> This patch series should not change the behavior of message ordering when
> coming from the same context. This means that if a client driver issues
> one or more spi_async() messages immediately followed by a spi_sync()
> message in the same context, it can still rely on these messages being
> sent out in the order they were fired.

Which git branch to use as the base?

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-15 13:31 ` [RFC] [PATCH v2 00/11] Optimize spi_sync path Marc Kleine-Budde
@ 2022-06-15 14:13   ` David Jander
  2022-06-20 18:15     ` Mark Brown
  0 siblings, 1 reply; 21+ messages in thread
From: David Jander @ 2022-06-15 14:13 UTC (permalink / raw)
  To: Marc Kleine-Budde; +Cc: Mark Brown, linux-spi, Andrew Lunn

On Wed, 15 Jun 2022 15:31:13 +0200
Marc Kleine-Budde <mkl@pengutronix.de> wrote:

> On 15.06.2022 14:46:23, David Jander wrote:
> > These patches optimize the spi_sync call for the common case that the
> > worker thread is idle and the queue is empty. It also opens the
> > possibility to potentially further optimize the async path also, since
> > it doesn't need to take into account the direct sync path anymore.
> > 
> > As an example for the performance gain, on an i.MX8MM SoC with a SPI CAN
> > controller attached (MCP2518FD), the time the interrupt line stays
> > active (which corresponds roughly with the time it takes to send 3
> > relatively short consecutive spi_sync messages) is reduced from 98us to
> > only 72us by this patch.
> > 
> > A note about message ordering:
> > 
> > This patch series should not change the behavior of message ordering when
> > coming from the same context. This means that if a client driver issues
> > one or more spi_async() messages immediately followed by a spi_sync()
> > message in the same context, it can still rely on these messages being
> > sent out in the order they were fired.  
> 
> Which git branch to use as the base?

Sorry, forgot to mention: spi for-next:
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git

This is because it relies on previous patches that have been applied there.

-- 
David Jander

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
                   ` (11 preceding siblings ...)
  2022-06-15 13:31 ` [RFC] [PATCH v2 00/11] Optimize spi_sync path Marc Kleine-Budde
@ 2022-06-16 13:22 ` Mark Brown
  2022-06-16 14:13   ` David Jander
  12 siblings, 1 reply; 21+ messages in thread
From: Mark Brown @ 2022-06-16 13:22 UTC (permalink / raw)
  To: David Jander; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn


On Wed, Jun 15, 2022 at 02:46:23PM +0200, David Jander wrote:
> These patches optimize the spi_sync call for the common case that the
> worker thread is idle and the queue is empty. It also opens the
> possibility to potentially further optimize the async path also, since
> it doesn't need to take into account the direct sync path anymore.

I've given this a first pass and it looks sensible so far - I'll need to
give it a more thorough look but I'd expect it should be fine.  The
numbers certainly look good.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-16 13:22 ` Mark Brown
@ 2022-06-16 14:13   ` David Jander
  2022-06-16 14:55     ` Mark Brown
  0 siblings, 1 reply; 21+ messages in thread
From: David Jander @ 2022-06-16 14:13 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn

On Thu, 16 Jun 2022 14:22:13 +0100
Mark Brown <broonie@kernel.org> wrote:

> On Wed, Jun 15, 2022 at 02:46:23PM +0200, David Jander wrote:
> > These patches optimize the spi_sync call for the common case that the
> > worker thread is idle and the queue is empty. It also opens the
> > possibility to potentially further optimize the async path also, since
> > it doesn't need to take into account the direct sync path anymore.  
> 
> I've given this a first pass and it looks sensible so far - I'll need to
> give it a more thorough look but I'd expect it should be fine.  The
> numbers certainly look good.

Thanks!
The current patch set probably needs to get partly squashed, since there are a
few patches that undo changes from a previous patch. I left them like this in
order to hopefully make the step by step mutation more clear for review.

I had some doubts about patch 11, since it introduces 2 new members to struct
spi_controller. I was trying to keep the pollution down, but I couldn't find a
better way to do this optimization. Any suggestions? Maybe a better name/place
for these flags?

Ideally this would get as much different hardware testing as possible before
going further upstream. Do you have access to some platforms suitable for
stressing SPI with multiple clients simultaneously? Known "problematic"
controllers maybe?

Looking forward to your comments.

Best regards,

-- 
David Jander

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-16 14:13   ` David Jander
@ 2022-06-16 14:55     ` Mark Brown
  2022-06-16 15:30       ` David Jander
  0 siblings, 1 reply; 21+ messages in thread
From: Mark Brown @ 2022-06-16 14:55 UTC (permalink / raw)
  To: David Jander; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn


On Thu, Jun 16, 2022 at 04:13:23PM +0200, David Jander wrote:
> Mark Brown <broonie@kernel.org> wrote:
> > On Wed, Jun 15, 2022 at 02:46:23PM +0200, David Jander wrote:

> > I've given this a first pass and it looks sensible so far - I'll need to
> > give it a more thorough look but I'd expect it should be fine.  The
> > numbers certainly look good.

> The current patch set probably needs to get partly squashed, since there are a
> few patches that undo changes from a previous patch. I left them like this in
> order to hopefully make the step by step mutation more clear for review.

Yes, there's a bit of stuff.  I think it's based off your previous
proposed patch too?

> I had some doubts about patch 11, since it introduces 2 new members to struct
> spi_controller. I was trying to keep the pollution down, but I couldn't find a
> better way to do this optimization. Any suggestions? Maybe a better name/place
> for these flags?

Not really - I'm not too concerned about individual flags since we don't
have so many SPI controllers in a system, it's not like it's a per task
overhead or similar.

> Ideally this would get as much different hardware testing as possible before
> going further upstream. Do you have access to some platforms suitable for
> stressing SPI with multiple clients simultaneously? Known "problematic"
> controllers maybe?

Well, the fastest way to get it into a wide range of CI is for me to
apply it so people who test -next will start covering it...  I was going
to kick it into my test branch KernelCI once I've got it reviewed
properly which will get at least some boot testing on a bunch of
platforms.

For testing the main thing that'd be nice for testing would probably be
coverage of controllers that don't block in transfer_one_message() and
those that complete in interrupt context while blocking in there.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-16 14:55     ` Mark Brown
@ 2022-06-16 15:30       ` David Jander
  2022-06-17 12:08         ` David Jander
  0 siblings, 1 reply; 21+ messages in thread
From: David Jander @ 2022-06-16 15:30 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn

On Thu, 16 Jun 2022 15:55:29 +0100
Mark Brown <broonie@kernel.org> wrote:

> On Thu, Jun 16, 2022 at 04:13:23PM +0200, David Jander wrote:
> > Mark Brown <broonie@kernel.org> wrote:  
> > > On Wed, Jun 15, 2022 at 02:46:23PM +0200, David Jander wrote:  
> 
> > > I've given this a first pass and it looks sensible so far - I'll need to
> > > give it a more thorough look but I'd expect it should be fine.  The
> > > numbers certainly look good.  
> 
> > The current patch set probably needs to get partly squashed, since there are a
> > few patches that undo changes from a previous patch. I left them like this in
> > order to hopefully make the step by step mutation more clear for review.  
> 
> Yes, there's a bit of stuff.  I think it's based off your previous
> proposed patch too?

Yes, in big part. I removed the API change, and all further optimizations and
improvements are done step by step on top, like your suggestion to introduce
the completion in __pump_messages and after that optimizing it further. Ideally
I should maybe have tried to split up patch 2 a bit more.

> > I had some doubts about patch 11, since it introduces 2 new members to struct
> > spi_controller. I was trying to keep the pollution down, but I couldn't find a
> > better way to do this optimization. Any suggestions? Maybe a better name/place
> > for these flags?  
> 
> Not really - I'm not too concerned about individual flags since we don't
> have so many SPI controllers in a system, it's not like it's a per task
> overhead or similar.

Ok, then we leave it as is. I was looking for a place that grouped "private"
or "internal" struct members, but couldn't fine one really. SPI drivers
looking at these wouldn't make sense I guess.

> > Ideally this would get as much different hardware testing as possible before
> > going further upstream. Do you have access to some platforms suitable for
> > stressing SPI with multiple clients simultaneously? Known "problematic"
> > controllers maybe?  
> 
> Well, the fastest way to get it into a wide range of CI is for me to
> apply it so people who test -next will start covering it...  I was going
> to kick it into my test branch KernelCI once I've got it reviewed
> properly which will get at least some boot testing on a bunch of
> platforms.

Ah, great. I will see if I can get it tested on some more other platforms from
our side.

> For testing the main thing that'd be nice for testing would probably be
> coverage of controllers that don't block in transfer_one_message() and
> those that complete in interrupt context while blocking in there.

Ah, yes, that would be ideal. spi-pl022.c and spi-axi-spi-engine.c do this
AFAIK.
Also, if someone could make some independent before/after performance
comparisons of this series and the per-cpu stats patch, that would be very
interesting. I don't want people to have to take my word for the
gains ;-)

Best regards,

-- 
David Jander


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-16 15:30       ` David Jander
@ 2022-06-17 12:08         ` David Jander
  0 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-17 12:08 UTC (permalink / raw)
  To: Mark Brown; +Cc: linux-spi, Marc Kleine-Budde, Andrew Lunn

On Thu, 16 Jun 2022 17:30:03 +0200
David Jander <david@protonic.nl> wrote:
> [...]
> > > Ideally this would get as much different hardware testing as possible before
> > > going further upstream. Do you have access to some platforms suitable for
> > > stressing SPI with multiple clients simultaneously? Known "problematic"
> > > controllers maybe?    
> > 
> > Well, the fastest way to get it into a wide range of CI is for me to
> > apply it so people who test -next will start covering it...  I was going
> > to kick it into my test branch KernelCI once I've got it reviewed
> > properly which will get at least some boot testing on a bunch of
> > platforms.  
> 
> Ah, great. I will see if I can get it tested on some more other platforms from
> our side.

Ok, I have managed to test kernel 5.19-rc2 with this patch set on an i.MX6DL
machine. This is an arm cortex-a9 dual core (32bit), but it does have the same
spi-imx controller that my 64-bit board has. It does have two SPI peripherals,
each on a separate bus unfortunately: A tsc2046e touchscreen controller using
the IIO ADC backend driver, and a SPI NOR flash chip.
Everything works fine. Hammering the NOR flash and the touchscreen
simultaneously works as expected. But I have some interesting performance
figures for several different kernel configurations.

First of all, the spi-imx driver seems to perform really badly in DMA
mode in all cases. Probably something wrong with SDMA on this board, or the
transfers are too small for DMA? The driver has two module parameters: use_dma
and polling_limit_us. Disabling DMA gets the driver to "normal" performance
levels, and setting polling_limit_us to a very big value forces the driver to
work in polling mode, i.e. to not use IRQs. Since the system isn't doing
anything apart from these tests, it is expected that forced polling mode will
likely give the highest throughput, but could be considered cheating.

Test procedure:

First, create some continuous stress on the SPI bus:

$ (while true; do dd if=/dev/mtdblock2 of=/dev/null bs=1024; done) >/dev/null 2>&1 &

The histogram shows this:

$ for t in /sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_*; do printf "%s:\t%d\n" $t `cat $t`; done
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_0-1:        228583
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_1024-2047:  0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_128-255:    1
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_16-31:      1
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_16384-32767:        0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_2-3:        228579
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_2048-4095:  0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_256-511:    0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_32-63:      1
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_32768-65535:        0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_4-7:        1
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_4096-8191:  0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_512-1023:   228576
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_64-127:     0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_65536+:     0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_8-15:       0
/sys/bus/spi/devices/spi0.0/statistics/transfer_bytes_histo_8192-16383: 0

There are no spi_async() transfers involved. The spi-nor driver apparently
only issues sync calls in this scenario, which makes sense. Note that even
though I specified "bs=1024" to the dd command, the spi-nor driver seems to
want to do 512 byte transfers anyway. Making "bs" very small doesn't change
this either, so we cannot simulate many very small transfers this way
unfortunately. This means that the performance impact is likely very limited
in this scenario, since the overhead of a single transfer is small compared
to its size.

The below data shows the CPU load distribution while running the above command
(this is the topmost line of the "top" tool I have on this board) and the
throughput of a single call to dd:

Tested kernel builds:

 1. Kernel 5.19-rc2 vanilla
 1.1. use_dma = true (default):

CPU:  0.4% usr 20.4% sys  0.0% nic  0.0% idle 77.5% io  0.0% irq  1.5% sirq
3604480 bytes (3.4MB) copied, 3.685375 seconds, 955.1KB/s

 1.2. use_dma = false:

CPU:  0.5% usr 55.9% sys  0.0% nic  0.0% idle 43.2% io  0.0% irq  0.2% sirq
3604480 bytes (3.4MB) copied, 2.017914 seconds, 1.7MB/s

 1.3. forced polling:

CPU:  0.3% usr 99.4% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq  0.1% sirq
3604480 bytes (3.4MB) copied, 1.330517 seconds, 2.6MB/s

 2. Kernel 5.19-rc2 with only the per-cpu stat patch:
 2.1. use_dma = true (default):

CPU:  0.2% usr 20.1% sys  0.0% nic  0.0% idle 78.3% io  0.0% irq  1.3% sirq
3604480 bytes (3.4MB) copied, 3.682393 seconds, 955.9KB/s

 2.2. use_dma = false:

CPU:  0.5% usr 56.8% sys  0.0% nic  0.0% idle 42.3% io  0.0% irq  0.2% sirq
3604480 bytes (3.4MB) copied, 2.015509 seconds, 1.7MB/s

 2.3. forced polling:

CPU:  0.3% usr 99.4% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq  0.1% sirq
3604480 bytes (3.4MB) copied, 1.319247 seconds, 2.6MB/s

 3. Kernel 5.19-rc2 with both per-cpu stats and optimized sync path patches:
 3.1. use_dma = true (default):

CPU:  0.0% usr 17.8% sys  0.0% nic  0.0% idle 81.0% io  0.0% irq  1.1% sirq
3604480 bytes (3.4MB) copied, 3.650646 seconds, 964.2KB/s

 3.2. use_dma = false:

CPU:  0.6% usr 43.8% sys  0.0% nic  0.0% idle 55.2% io  0.0% irq  0.3% sirq
3604480 bytes (3.4MB) copied, 1.971478 seconds, 1.7MB/s

 3.3. forced polling:

CPU:  0.3% usr 99.6% sys  0.0% nic  0.0% idle  0.0% io  0.0% irq  0.0% sirq
3604480 bytes (3.4MB) copied, 1.314353 seconds, 2.6MB/s

Of course I measured each value several times. Results were very consistent,
and I picked the median result for each of the cases above.
The differences are small, as is expected given the fact that data transfers
are rather big, but specially if you look at the "seconds" value returned by
"dd", there is an incremental improvement for each of the patches. The total
improvement seems to be around 1.5-2% less CPU overhead in total... which IMHO
is still remarkable given how insensitive this test is for transfer overhead.
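
[Editorial note] The incremental improvement can be quantified directly from
the median dd wall-times quoted above; a quick check of the arithmetic:

```python
# Median dd wall-times (seconds) quoted above, non-DMA (PIO) mode
vanilla  = 2.017914   # 5.19-rc2 vanilla
stats    = 2.015509   # + per-cpu stats patch
syncopt  = 1.971478   # + optimized sync path patches

total_gain = (vanilla - syncopt) / vanilla * 100
print(f"non-DMA total: {total_gain:.1f}%")    # non-DMA total: 2.3%

# Forced-polling case, where per-message overhead matters least
polling_gain = (1.330517 - 1.314353) / 1.330517 * 100
print(f"polling: {polling_gain:.1f}%")        # polling: 1.2%
```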

Testing the impact on the touchscreen driver doesn't make much sense though,
since it runs at a 1MHz clock rate and is already optimized to only do very
large single SPI transfers, so any gain will be lost in the noise.

Will report more results if I can test on other platforms.

Best regards,

-- 
David Jander


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-15 14:13   ` David Jander
@ 2022-06-20 18:15     ` Mark Brown
  2022-06-21  6:15       ` David Jander
  0 siblings, 1 reply; 21+ messages in thread
From: Mark Brown @ 2022-06-20 18:15 UTC (permalink / raw)
  To: David Jander; +Cc: Marc Kleine-Budde, linux-spi, Andrew Lunn

On Wed, Jun 15, 2022 at 04:13:56PM +0200, David Jander wrote:
> Marc Kleine-Budde <mkl@pengutronix.de> wrote:

> > Which git branch to use as the base?

> Sorry, forgot to mention: spi for-next:
> git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git

> This is because it relies on previous patches that have been applied there.

It looks like something changed underneath the series unfortunately -
I'm getting failures applying even the first patch.  Can you rebase
please?  It looks good to start trying to throw at CI, ideally we'd get
more on list review but most of the active work recently has been around
the MTD stuff which is all about trying to use hardware offload for
flash operations and therefore should be minimally affected by the
actual SPI data path.


* Re: [RFC] [PATCH v2 00/11] Optimize spi_sync path
  2022-06-20 18:15     ` Mark Brown
@ 2022-06-21  6:15       ` David Jander
  0 siblings, 0 replies; 21+ messages in thread
From: David Jander @ 2022-06-21  6:15 UTC (permalink / raw)
  To: Mark Brown; +Cc: Marc Kleine-Budde, linux-spi, Andrew Lunn

On Mon, 20 Jun 2022 19:15:13 +0100
Mark Brown <broonie@kernel.org> wrote:

> On Wed, Jun 15, 2022 at 04:13:56PM +0200, David Jander wrote:
> > Marc Kleine-Budde <mkl@pengutronix.de> wrote:  
> 
> > > Which git branch to use as the base?  
> 
> > Sorry, forgot to mention: spi for-next:
> > git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git  
> 
> > This is because it relies on previous patches that have been applied there.  
> 
> It looks like something changed underneath the series unfortunately -
> I'm getting failures applying even the first patch.  Can you rebase
> please?  It looks good to start trying to throw at CI, ideally we'd get
> more on list review but most of the active work recently has been around
> the MTD stuff which is all about trying to use hardware offload for
> flash operations and therefore should be minimally affected by the
> actual SPI data path.

I just rebased v3 onto current spi-next branch. No other changes.

Best regards,

-- 
David Jander



end of thread, other threads:[~2022-06-21  6:15 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-15 12:46 [RFC] [PATCH v2 00/11] Optimize spi_sync path David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 01/11] spi: Move ctlr->cur_msg_prepared to struct spi_message David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 02/11] spi: Don't use the message queue if possible in spi_sync David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 03/11] spi: Lock controller idling transition inside the io_mutex David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 04/11] spi: __spi_pump_messages: Consolidate spin_unlocks to goto target David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 05/11] spi: Remove check for controller idling in spi sync path David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 06/11] spi: Remove check for idling in __spi_pump_messages() David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 07/11] spi: Remove the now unused ctlr->idling flag David Jander
2022-06-15 12:46 ` [RFC] [PATCH 08/11] spi: Remove unneeded READ_ONCE for ctlr->busy flag David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 09/11] spi: Set ctlr->cur_msg also in the sync transfer case David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 10/11] spi: Ensure the io_mutex is held until spi_finalize_current_message() David Jander
2022-06-15 12:46 ` [RFC] [PATCH v2 11/11] spi: opportunistically skip ctlr->cur_msg_completion David Jander
2022-06-15 13:31 ` [RFC] [PATCH v2 00/11] Optimize spi_sync path Marc Kleine-Budde
2022-06-15 14:13   ` David Jander
2022-06-20 18:15     ` Mark Brown
2022-06-21  6:15       ` David Jander
2022-06-16 13:22 ` Mark Brown
2022-06-16 14:13   ` David Jander
2022-06-16 14:55     ` Mark Brown
2022-06-16 15:30       ` David Jander
2022-06-17 12:08         ` David Jander
