[PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements
@ 2020-06-16 10:40 Douglas Anderson
  2020-06-16 10:40 ` [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls Douglas Anderson
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, Dilip Kota, linux-arm-msm, linux-kernel,
	linux-spi

This patch series is a new version of the previous patch posted:
  [PATCH v2] spi: spi-geni-qcom: Speculative fix of "nobody cared" about interrupt
  https://lore.kernel.org/r/20200317133653.v2.1.I752ebdcfd5e8bf0de06d66e767b8974932b3620e@changeid

At this point I've done enough tracing to know that there was a real
race in the old code (not just weakly ordered memory problems) and
that should be fixed with the locking patches.

While looking at this driver, I also noticed we weren't properly
noting error interrupts and also weren't actually using our FIFO
effectively, so I fixed those.

The last patch in the series addresses review feedback about dislike
for the "cur_mcmd" state variable.  It also could possibly make
"abort" work ever-so-slightly more reliably.

Changes in v3:
- ("spi: spi-geni-qcom: No need for irqsave variant...") new for v3
- Split out some lock cleanup to previous patch.
- Don't need to read IRQ status register inside spinlock.
- Don't check for state CMD_NONE; later patch is removing state var.
- Don't hold the lock for all of setup_fifo_xfer().
- Comment about why it's safe to Ack interrupts at the end.
- Subject/desc changed since race is definitely there.
- ("spi: spi-geni-qcom: Check for error IRQs") new in v3.
- ("spi: spi-geni-qcom: Actually use our FIFO") new in v3.
- ("spi: spi-geni-qcom: Don't keep a local state variable") new in v3.

Changes in v2:
- Detect true spurious interrupt.
- Still return IRQ_NONE for state machine mismatch, but print warn.

Douglas Anderson (5):
  spi: spi-geni-qcom: No need for irqsave variant of spinlock calls
  spi: spi-geni-qcom: Mo' betta locking
  spi: spi-geni-qcom: Check for error IRQs
  spi: spi-geni-qcom: Actually use our FIFO
  spi: spi-geni-qcom: Don't keep a local state variable

 drivers/spi/spi-geni-qcom.c | 120 ++++++++++++++++++++++++------------
 1 file changed, 81 insertions(+), 39 deletions(-)

-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
@ 2020-06-16 10:40 ` Douglas Anderson
  2020-06-17 20:24   ` Stephen Boyd
  2020-06-16 10:40 ` [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking Douglas Anderson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

The driver locks its locks in two places.

In the first usage of the lock the function doing the locking already
has a sleeping call and thus we know we can't be called from interrupt
context.  That means we can use the "spin_lock_irq" variant of the
function.

In the second usage of the lock the function is the interrupt handler
and we know interrupt handlers are called with interrupts disabled.
That means we can use the "spin_lock" variant of the function.

This patch is expected to be a no-op and is just a cleanup / slight
optimization.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- ("spi: spi-geni-qcom: No need for irqsave variant...") new for v3

Changes in v2: None

 drivers/spi/spi-geni-qcom.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index c3972424af71..c7d2c7e45b3f 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -122,23 +122,23 @@ static void handle_fifo_timeout(struct spi_master *spi,
 				struct spi_message *msg)
 {
 	struct spi_geni_master *mas = spi_master_get_devdata(spi);
-	unsigned long time_left, flags;
+	unsigned long time_left;
 	struct geni_se *se = &mas->se;
 
-	spin_lock_irqsave(&mas->lock, flags);
+	spin_lock_irq(&mas->lock);
 	reinit_completion(&mas->xfer_done);
 	mas->cur_mcmd = CMD_CANCEL;
 	geni_se_cancel_m_cmd(se);
 	writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
-	spin_unlock_irqrestore(&mas->lock, flags);
+	spin_unlock_irq(&mas->lock);
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (time_left)
 		return;
 
-	spin_lock_irqsave(&mas->lock, flags);
+	spin_lock_irq(&mas->lock);
 	reinit_completion(&mas->xfer_done);
 	geni_se_abort_m_cmd(se);
-	spin_unlock_irqrestore(&mas->lock, flags);
+	spin_unlock_irq(&mas->lock);
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (!time_left)
 		dev_err(mas->dev, "Failed to cancel/abort m_cmd\n");
@@ -477,12 +477,11 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 	struct spi_geni_master *mas = spi_master_get_devdata(spi);
 	struct geni_se *se = &mas->se;
 	u32 m_irq;
-	unsigned long flags;
 
 	if (mas->cur_mcmd == CMD_NONE)
 		return IRQ_NONE;
 
-	spin_lock_irqsave(&mas->lock, flags);
+	spin_lock(&mas->lock);
 	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
 
 	if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
@@ -524,7 +523,7 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 	}
 
 	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
-	spin_unlock_irqrestore(&mas->lock, flags);
+	spin_unlock(&mas->lock);
 	return IRQ_HANDLED;
 }
 
-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
  2020-06-16 10:40 ` [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls Douglas Anderson
@ 2020-06-16 10:40 ` Douglas Anderson
  2020-06-17 20:53   ` Stephen Boyd
  2020-06-16 10:40 ` [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs Douglas Anderson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, Dilip Kota, linux-arm-msm, linux-kernel,
	linux-spi

If you added a bit of a delay (like a trace_printk) into the ISR for
the spi-geni-qcom driver, you would suddenly start seeing some errors
spit out.  The problem was that, though the ISR itself held a lock,
other parts of the driver didn't always grab the lock.

One example race was this:
a) Driver queues off a command to set a Chip Select (CS).
b) ISR fires indicating the CS is done.
c) Done bit is set, so we complete().
d) Second CPU gallops off and starts a transfer.
e) Second CPU starts messing with hardware / software state (not under
   spinlock).
f) ISR now does things like set "mas->cur_mcmd" to CMD_NONE, prints
   errors if "tx_rem_bytes" / "rx_rem_bytes" have been set, and also
   Acks all interrupts it handled.

Let's fix this.  Before we start messing with hardware, we'll grab the
lock to make sure that the IRQ handler from some previous command has
really finished.  We don't need to hold the lock unless we're in a
state where more interrupts can come in, but we at least need to make
sure the previous IRQ is done.  This lock is used exclusively to
prevent the IRQ handler and non-IRQ from stomping on each other.  The
SPI core handles all other mutual exclusion.

As part of this, we change the way that the IRQ handler detects
spurious interrupts.  Previously we checked for our state variable
being set to IRQ_NONE, but that was done outside the spinlock.  We
could move it into the spinlock, but instead let's just change it to
look for the lack of any IRQ status bits being set.  This can be done
outside the lock--the hardware certainly isn't grabbing or looking at
the spinlock when it updates its status register.

It's possible that this will fix real (but very rare) errors seen in
the field that look like:
  irq ...: nobody cared (try booting with the "irqpoll" option)

NOTE: an alternate strategy considered here was to always make the
complete() / spi_finalize_current_transfer() the very last thing in
our IRQ handler.  With such a change you could consider that we could
be "lockless".  In that case, though, we'd have to be very careful w/
memory barriers so we made sure we didn't have any bugs with weakly
ordered memory.  Using spinlocks makes the driver much easier to
understand.

Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- Split out some lock cleanup to previous patch.
- Don't need to read IRQ status register inside spinlock.
- Don't check for state CMD_NONE; later patch is removing state var.
- Don't hold the lock for all of setup_fifo_xfer().
- Comment about why it's safe to Ack interrupts at the end.
- Subject/desc changed since race is definitely there.

Changes in v2:
- Detect true spurious interrupt.
- Still return IRQ_NONE for state machine mismatch, but print warn.

 drivers/spi/spi-geni-qcom.c | 45 ++++++++++++++++++++++++++++++++++---
 1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index c7d2c7e45b3f..e0f0e5c241f3 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -151,16 +151,18 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
 	struct geni_se *se = &mas->se;
 	unsigned long time_left;
 
-	reinit_completion(&mas->xfer_done);
 	pm_runtime_get_sync(mas->dev);
 	if (!(slv->mode & SPI_CS_HIGH))
 		set_flag = !set_flag;
 
+	spin_lock_irq(&mas->lock);
+	reinit_completion(&mas->xfer_done);
 	mas->cur_mcmd = CMD_CS;
 	if (set_flag)
 		geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
 	else
 		geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
+	spin_unlock_irq(&mas->lock);
 
 	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
 	if (!time_left)
@@ -307,6 +309,21 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	u32 spi_tx_cfg, len;
 	struct geni_se *se = &mas->se;
 
+	/*
+	 * Ensure that our interrupt handler isn't still running from some
+	 * prior command before we start messing with the hardware behind
+	 * its back.  We don't need to _keep_ the lock here since we're only
+	 * worried about racing with out interrupt handler.  The SPI core
+	 * already handles making sure that we're not trying to do two
+	 * transfers at once or setting a chip select and doing a transfer
+	 * concurrently.
+	 *
+	 * NOTE: we actually _can't_ hold the lock here because possibly we
+	 * might call clk_set_rate() which needs to be able to sleep.
+	 */
+	spin_lock_irq(&mas->lock);
+	spin_unlock_irq(&mas->lock);
+
 	spi_tx_cfg = readl(se->base + SE_SPI_TRANS_CFG);
 	if (xfer->bits_per_word != mas->cur_bits_per_word) {
 		spi_setup_word_len(mas, mode, xfer->bits_per_word);
@@ -367,6 +384,12 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	}
 	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
 	mas->cur_mcmd = CMD_XFER;
+
+	/*
+	 * Lock around right before we start the transfer since our
+	 * interrupt controller could come in at any time now.
+	 */
+	spin_lock_irq(&mas->lock);
 	geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
 
 	/*
@@ -376,6 +399,7 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 	 */
 	if (m_cmd & SPI_TX_ONLY)
 		writel(mas->tx_wm, se->base + SE_GENI_TX_WATERMARK_REG);
+	spin_unlock_irq(&mas->lock);
 }
 
 static int spi_geni_transfer_one(struct spi_master *spi,
@@ -478,11 +502,11 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 	struct geni_se *se = &mas->se;
 	u32 m_irq;
 
-	if (mas->cur_mcmd == CMD_NONE)
+	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
+	if (!m_irq)
 		return IRQ_NONE;
 
 	spin_lock(&mas->lock);
-	m_irq = readl(se->base + SE_GENI_M_IRQ_STATUS);
 
 	if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
 		geni_spi_handle_rx(mas);
@@ -522,8 +546,23 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 		complete(&mas->xfer_done);
 	}
 
+	/*
+	 * It's safe or a good idea to Ack all of our our interrupts at the
+	 * end of the function. Specifically:
+	 * - M_CMD_DONE_EN / M_RX_FIFO_LAST_EN: Edge triggered interrupts and
+	 *   clearing Acks. Clearing at the end relies on nobody else having
+	 *   started a new transfer yet or else we could be clearing _their_
+	 *   done bit, but everyone grabs the spinlock before starting a new
+	 *   transfer.
+	 * - M_RX_FIFO_WATERMARK_EN / M_TX_FIFO_WATERMARK_EN: These appear
+	 *   to be "latched level" interrupts so it's important to clear them
+	 *   _after_ you've handled the condition and always safe to do so
+	 *   since they'll re-assert if they're still happening.
+	 */
 	writel(m_irq, se->base + SE_GENI_M_IRQ_CLEAR);
+
 	spin_unlock(&mas->lock);
+
 	return IRQ_HANDLED;
 }
 
-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
  2020-06-16 10:40 ` [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls Douglas Anderson
  2020-06-16 10:40 ` [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking Douglas Anderson
@ 2020-06-16 10:40 ` Douglas Anderson
  2020-06-17 20:54   ` Stephen Boyd
  2020-06-16 10:40 ` [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO Douglas Anderson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

From reading the #defines it seems like we should shout if we ever see
one of these error bits.  Let's do so.  This doesn't do anything
functional except print a yell in the log if the error bits are seen.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- ("spi: spi-geni-qcom: Check for error IRQs") new in v3.

Changes in v2: None

 drivers/spi/spi-geni-qcom.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index e0f0e5c241f3..56628d45421e 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -506,6 +506,11 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 	if (!m_irq)
 		return IRQ_NONE;
 
+	if (m_irq & (M_CMD_OVERRUN_EN | M_ILLEGAL_CMD_EN | M_CMD_FAILURE_EN |
+		     M_RX_FIFO_RD_ERR_EN | M_RX_FIFO_WR_ERR_EN |
+		     M_TX_FIFO_RD_ERR_EN | M_TX_FIFO_WR_ERR_EN))
+		dev_warn(mas->dev, "Unexpected IRQ err status %#010x\n", m_irq);
+
 	spin_lock(&mas->lock);
 
 	if ((m_irq & M_RX_FIFO_WATERMARK_EN) || (m_irq & M_RX_FIFO_LAST_EN))
-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
                   ` (2 preceding siblings ...)
  2020-06-16 10:40 ` [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs Douglas Anderson
@ 2020-06-16 10:40 ` Douglas Anderson
  2020-06-17 20:56   ` Stephen Boyd
  2020-06-16 10:40 ` [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable Douglas Anderson
  2020-06-18 16:48 ` [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Mark Brown
  5 siblings, 1 reply; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

The geni hardware has a FIFO that can hold up to 64 bytes (it has 16
entries that can hold 4 bytes each), at least on the two SoCs I tested
(sdm845 and sc7180).  We configured our RX Watermark to 0, which
basically meant we got an interrupt as soon as the first 4 bytes
showed up in the FIFO.  Tracing the IRQ handler showed that we often
only read 4 or 8 bytes per IRQ handler.

I tried setting the RX Watermark to "fifo size - 2" but that just got
me a bunch of overrun errors reported.  Setting it to "fifo size - 3"
seemed to work great, though.  This made me worried that we'd start
getting overruns if we had long interrupt latency, but that doesn't
appear to be the case and delays inserted in the IRQ handler while
using "fifo size - 3" didn't cause any errors.  Presumably there is
some interaction with the poorly-documented RFR (ready for receive)
level means that "fifo size - 3" is the max.  We are the SPI master,
so it makes sense that there would be no problems with overruns, the
master should just stop clocking.

Despite "fifo size - 3" working, I chose "fifo size / 2" (8 entries =
32 bytes) which gives us a little extra time to get to the interrupt
handler and should reduce dead time on the SPI wires.  With this
setting, I often saw the IRQ handler handle 40 bytes but sometimes up
to 56 if we had bad interrupt latency.

Testing by running "flashrom -p ec -r" on a Chromebook saw interrupts
from the SPI driver cut roughly in half.  Time was roughly the same.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- ("spi: spi-geni-qcom: Actually use our FIFO") new in v3.

Changes in v2: None

 drivers/spi/spi-geni-qcom.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 56628d45421e..63a62548b078 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -285,7 +285,7 @@ static int spi_geni_init(struct spi_geni_master *mas)
 	 * Hardware programming guide suggests to configure
 	 * RX FIFO RFR level to fifo_depth-2.
 	 */
-	geni_se_init(se, 0x0, mas->tx_fifo_depth - 2);
+	geni_se_init(se, mas->tx_fifo_depth / 2, mas->tx_fifo_depth - 2);
 	/* Transmit an entire FIFO worth of data per IRQ */
 	mas->tx_wm = 1;
 	ver = geni_se_get_qup_hw_version(se);
-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
                   ` (3 preceding siblings ...)
  2020-06-16 10:40 ` [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO Douglas Anderson
@ 2020-06-16 10:40 ` Douglas Anderson
  2020-06-17 21:16   ` Stephen Boyd
  2020-06-18 16:48 ` [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Mark Brown
  5 siblings, 1 reply; 15+ messages in thread
From: Douglas Anderson @ 2020-06-16 10:40 UTC (permalink / raw)
  To: Mark Brown
  Cc: Alok Chauhan, skakit, swboyd, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

The variable "cur_mcmd" kept track of our current state (idle, xfer,
cs, cancel).  We don't really need it, so get rid of it.  Instead:
* Use separate condition variables for "chip select done", "cancel
  done", and "abort done".  This is important so that if a "done"
  comes through (perhaps some previous interrupt finally came through)
  it can't confuse the cancel/abort function.
* Use the "done" interrupt only for when a chip select or transfer is
  done and we can tell the difference by looking at whether "cur_xfer"
  is NULL.

This is mostly a no-op change.  However, it is possible it could fix
an issue where a super delayed interrupt for a cancel command could
have confused our waiting for an abort command.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v3:
- ("spi: spi-geni-qcom: Don't keep a local state variable") new in v3.

Changes in v2: None

 drivers/spi/spi-geni-qcom.c | 55 ++++++++++++++++++-------------------
 1 file changed, 27 insertions(+), 28 deletions(-)

diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
index 63a62548b078..6feea88d63ac 100644
--- a/drivers/spi/spi-geni-qcom.c
+++ b/drivers/spi/spi-geni-qcom.c
@@ -63,13 +63,6 @@
 #define TIMESTAMP_AFTER		BIT(3)
 #define POST_CMD_DELAY		BIT(4)
 
-enum spi_m_cmd_opcode {
-	CMD_NONE,
-	CMD_XFER,
-	CMD_CS,
-	CMD_CANCEL,
-};
-
 struct spi_geni_master {
 	struct geni_se se;
 	struct device *dev;
@@ -81,10 +74,11 @@ struct spi_geni_master {
 	unsigned int tx_rem_bytes;
 	unsigned int rx_rem_bytes;
 	const struct spi_transfer *cur_xfer;
-	struct completion xfer_done;
+	struct completion cs_done;
+	struct completion cancel_done;
+	struct completion abort_done;
 	unsigned int oversampling;
 	spinlock_t lock;
-	enum spi_m_cmd_opcode cur_mcmd;
 	int irq;
 };
 
@@ -126,20 +120,23 @@ static void handle_fifo_timeout(struct spi_master *spi,
 	struct geni_se *se = &mas->se;
 
 	spin_lock_irq(&mas->lock);
-	reinit_completion(&mas->xfer_done);
-	mas->cur_mcmd = CMD_CANCEL;
-	geni_se_cancel_m_cmd(se);
+	reinit_completion(&mas->cancel_done);
 	writel(0, se->base + SE_GENI_TX_WATERMARK_REG);
+	mas->cur_xfer = NULL;
+	mas->tx_rem_bytes = mas->rx_rem_bytes = 0;
+	geni_se_cancel_m_cmd(se);
 	spin_unlock_irq(&mas->lock);
-	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
+
+	time_left = wait_for_completion_timeout(&mas->cancel_done, HZ);
 	if (time_left)
 		return;
 
 	spin_lock_irq(&mas->lock);
-	reinit_completion(&mas->xfer_done);
+	reinit_completion(&mas->abort_done);
 	geni_se_abort_m_cmd(se);
 	spin_unlock_irq(&mas->lock);
-	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
+
+	time_left = wait_for_completion_timeout(&mas->abort_done, HZ);
 	if (!time_left)
 		dev_err(mas->dev, "Failed to cancel/abort m_cmd\n");
 }
@@ -156,15 +153,14 @@ static void spi_geni_set_cs(struct spi_device *slv, bool set_flag)
 		set_flag = !set_flag;
 
 	spin_lock_irq(&mas->lock);
-	reinit_completion(&mas->xfer_done);
-	mas->cur_mcmd = CMD_CS;
+	reinit_completion(&mas->cs_done);
 	if (set_flag)
 		geni_se_setup_m_cmd(se, SPI_CS_ASSERT, 0);
 	else
 		geni_se_setup_m_cmd(se, SPI_CS_DEASSERT, 0);
 	spin_unlock_irq(&mas->lock);
 
-	time_left = wait_for_completion_timeout(&mas->xfer_done, HZ);
+	time_left = wait_for_completion_timeout(&mas->cs_done, HZ);
 	if (!time_left)
 		handle_fifo_timeout(spi, NULL);
 
@@ -383,7 +379,6 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
 		mas->rx_rem_bytes = xfer->len;
 	}
 	writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
-	mas->cur_mcmd = CMD_XFER;
 
 	/*
 	 * Lock around right before we start the transfer since our
@@ -520,11 +515,13 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 		geni_spi_handle_tx(mas);
 
 	if (m_irq & M_CMD_DONE_EN) {
-		if (mas->cur_mcmd == CMD_XFER)
+		if (mas->cur_xfer) {
 			spi_finalize_current_transfer(spi);
-		else if (mas->cur_mcmd == CMD_CS)
-			complete(&mas->xfer_done);
-		mas->cur_mcmd = CMD_NONE;
+			mas->cur_xfer = NULL;
+		} else {
+			complete(&mas->cs_done);
+		}
+
 		/*
 		 * If this happens, then a CMD_DONE came before all the Tx
 		 * buffer bytes were sent out. This is unusual, log this
@@ -546,10 +543,10 @@ static irqreturn_t geni_spi_isr(int irq, void *data)
 				mas->rx_rem_bytes, mas->cur_bits_per_word);
 	}
 
-	if ((m_irq & M_CMD_CANCEL_EN) || (m_irq & M_CMD_ABORT_EN)) {
-		mas->cur_mcmd = CMD_NONE;
-		complete(&mas->xfer_done);
-	}
+	if (m_irq & M_CMD_CANCEL_EN)
+		complete(&mas->cancel_done);
+	if (m_irq & M_CMD_ABORT_EN)
+		complete(&mas->abort_done);
 
 	/*
 	 * It's safe or a good idea to Ack all of our our interrupts at the
@@ -617,7 +614,9 @@ static int spi_geni_probe(struct platform_device *pdev)
 	spi->handle_err = handle_fifo_timeout;
 	spi->set_cs = spi_geni_set_cs;
 
-	init_completion(&mas->xfer_done);
+	init_completion(&mas->cs_done);
+	init_completion(&mas->cancel_done);
+	init_completion(&mas->abort_done);
 	spin_lock_init(&mas->lock);
 	pm_runtime_enable(dev);
 
-- 
2.27.0.290.gba653c62da-goog


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls
  2020-06-16 10:40 ` [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls Douglas Anderson
@ 2020-06-17 20:24   ` Stephen Boyd
  0 siblings, 0 replies; 15+ messages in thread
From: Stephen Boyd @ 2020-06-17 20:24 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: Alok Chauhan, skakit, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

Quoting Douglas Anderson (2020-06-16 03:40:46)
> The driver locks its locks in two places.
> 
> In the first usage of the lock the function doing the locking already
> has a sleeping call and thus we know we can't be called from interrupt
> context.  That means we can use the "spin_lock_irq" variant of the
> function.
> 
> In the second usage of the lock the function is the interrupt handler
> and we know interrupt handlers are called with interrupts disabled.
> That means we can use the "spin_lock" variant of the function.
> 
> This patch is expected to be a no-op and is just a cleanup / slight
> optimization.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking
  2020-06-16 10:40 ` [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking Douglas Anderson
@ 2020-06-17 20:53   ` Stephen Boyd
  2020-06-17 21:19     ` Doug Anderson
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Boyd @ 2020-06-17 20:53 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: Alok Chauhan, skakit, Douglas Anderson, Andy Gross,
	Bjorn Andersson, Dilip Kota, linux-arm-msm, linux-kernel,
	linux-spi

Quoting Douglas Anderson (2020-06-16 03:40:47)
> If you added a bit of a delay (like a trace_printk) into the ISR for
> the spi-geni-qcom driver, you would suddenly start seeing some errors
> spit out.  The problem was that, though the ISR itself held a lock,
> other parts of the driver didn't always grab the lock.
> 
> One example race was this:
> a) Driver queues off a command to set a Chip Select (CS).
> b) ISR fires indicating the CS is done.
> c) Done bit is set, so we complete().
> d) Second CPU gallops off and starts a transfer.
> e) Second CPU starts messing with hardware / software state (not under
>    spinlock).
> f) ISR now does things like set "mas->cur_mcmd" to CMD_NONE, prints
>    errors if "tx_rem_bytes" / "rx_rem_bytes" have been set, and also
>    Acks all interrupts it handled.

Can we get a CPU0/CPU1 diagram here? At point e) I got sort of lost. And
maybe it's not even a dual CPU problem? i.e. it can happen on one CPU?

    CPU0
    ----
 a) spi_geni_set_cs()
     mas->cur_mcmd = CMD_CS
     wait_for_completion_timeout(&xfer_done)
 b)  <INTERRUPT>
     geni_spi_isr()
 c)   complete(&xfer_done);
     <END INTERRUPT>
     pm_runtime_put(mas->dev);
 d) galloping?

I got lost... Sorry!
    
> 
> Let's fix this.  Before we start messing with hardware, we'll grab the
> lock to make sure that the IRQ handler from some previous command has
> really finished.  We don't need to hold the lock unless we're in a
> state where more interrupts can come in, but we at least need to make
> sure the previous IRQ is done.  This lock is used exclusively to
> prevent the IRQ handler and non-IRQ from stomping on each other.  The
> SPI core handles all other mutual exclusion.

Ok maybe the problem is the completion at c) never happens until the
wait_for_completion_timeout() times out?

> 
> As part of this, we change the way that the IRQ handler detects
> spurious interrupts.  Previously we checked for our state variable
> being set to IRQ_NONE, but that was done outside the spinlock.  We
> could move it into the spinlock, but instead let's just change it to
> look for the lack of any IRQ status bits being set.  This can be done
> outside the lock--the hardware certainly isn't grabbing or looking at
> the spinlock when it updates its status register.
> 
> It's possible that this will fix real (but very rare) errors seen in
> the field that look like:
>   irq ...: nobody cared (try booting with the "irqpoll" option)
> 
> NOTE: an alternate strategy considered here was to always make the
> complete() / spi_finalize_current_transfer() the very last thing in
> our IRQ handler.  With such a change you could consider that we could
> be "lockless".  In that case, though, we'd have to be very careful w/
> memory barriers so we made sure we didn't have any bugs with weakly
> ordered memory.  Using spinlocks makes the driver much easier to
> understand.
> 
> Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
> Changes in v3:
> - Split out some lock cleanup to previous patch.
> - Don't need to read IRQ status register inside spinlock.
> - Don't check for state CMD_NONE; later patch is removing state var.
> - Don't hold the lock for all of setup_fifo_xfer().
> - Comment about why it's safe to Ack interrupts at the end.
> - Subject/desc changed since race is definitely there.
> 
> Changes in v2:
> - Detect true spurious interrupt.
> - Still return IRQ_NONE for state machine mismatch, but print warn.
> 
>  drivers/spi/spi-geni-qcom.c | 45 ++++++++++++++++++++++++++++++++++---
>  1 file changed, 42 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index c7d2c7e45b3f..e0f0e5c241f3 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -367,6 +384,12 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
>         }
>         writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
>         mas->cur_mcmd = CMD_XFER;
> +
> +       /*
> +        * Lock around right before we start the transfer since our
> +        * interrupt controller could come in at any time now.

drop 'controller'?

> +        */
> +       spin_lock_irq(&mas->lock);
>         geni_se_setup_m_cmd(se, m_cmd, FRAGMENTATION);
>  
>         /*

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs
  2020-06-16 10:40 ` [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs Douglas Anderson
@ 2020-06-17 20:54   ` Stephen Boyd
  0 siblings, 0 replies; 15+ messages in thread
From: Stephen Boyd @ 2020-06-17 20:54 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: Alok Chauhan, skakit, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

Quoting Douglas Anderson (2020-06-16 03:40:48)
> From reading the #defines it seems like we should shout if we ever see
> one of these error bits.  Let's do so.  This doesn't do anything
> functional except print a yell in the log if the error bits are seen.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO
  2020-06-16 10:40 ` [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO Douglas Anderson
@ 2020-06-17 20:56   ` Stephen Boyd
  0 siblings, 0 replies; 15+ messages in thread
From: Stephen Boyd @ 2020-06-17 20:56 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: Alok Chauhan, skakit, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

Quoting Douglas Anderson (2020-06-16 03:40:49)
> The geni hardware has a FIFO that can hold up to 64 bytes (it has 16
> entries that can hold 4 bytes each), at least on the two SoCs I tested
> (sdm845 and sc7180).  We configured our RX Watermark to 0, which
> basically meant we got an interrupt as soon as the first 4 bytes
> showed up in the FIFO.  Tracing the IRQ handler showed that we often
> only read 4 or 8 bytes per IRQ handler.
> 
> I tried setting the RX Watermark to "fifo size - 2" but that just got
> me a bunch of overrun errors reported.  Setting it to "fifo size - 3"
> seemed to work great, though.  This made me worried that we'd start
> getting overruns if we had long interrupt latency, but that doesn't
> appear to be the case and delays inserted in the IRQ handler while
> using "fifo size - 3" didn't cause any errors.  Presumably there is
> some interaction with the poorly-documented RFR (ready for receive)
> level means that "fifo size - 3" is the max.  We are the SPI master,
> so it makes sense that there would be no problems with overruns, the
> master should just stop clocking.
> 
> Despite "fifo size - 3" working, I chose "fifo size / 2" (8 entries =
> 32 bytes) which gives us a little extra time to get to the interrupt
> handler and should reduce dead time on the SPI wires.  With this
> setting, I often saw the IRQ handler handle 40 bytes but sometimes up
> to 56 if we had bad interrupt latency.
> 
> Testing by running "flashrom -p ec -r" on a Chromebook saw interrupts
> from the SPI driver cut roughly in half.  Time was roughly the same.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

Nice improvement. Maybe it can still have a Fixes tag because it's a
performance problem?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable
  2020-06-16 10:40 ` [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable Douglas Anderson
@ 2020-06-17 21:16   ` Stephen Boyd
  2020-06-18 15:05     ` Doug Anderson
  0 siblings, 1 reply; 15+ messages in thread
From: Stephen Boyd @ 2020-06-17 21:16 UTC (permalink / raw)
  To: Douglas Anderson, Mark Brown
  Cc: Alok Chauhan, skakit, Douglas Anderson, Andy Gross,
	Bjorn Andersson, linux-arm-msm, linux-kernel, linux-spi

Quoting Douglas Anderson (2020-06-16 03:40:50)
> diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> index 63a62548b078..6feea88d63ac 100644
> --- a/drivers/spi/spi-geni-qcom.c
> +++ b/drivers/spi/spi-geni-qcom.c
> @@ -63,13 +63,6 @@
>  #define TIMESTAMP_AFTER                BIT(3)
>  #define POST_CMD_DELAY         BIT(4)
>  
> -enum spi_m_cmd_opcode {
> -       CMD_NONE,
> -       CMD_XFER,
> -       CMD_CS,
> -       CMD_CANCEL,
> -};
> -
>  struct spi_geni_master {
>         struct geni_se se;
>         struct device *dev;
> @@ -81,10 +74,11 @@ struct spi_geni_master {
>         unsigned int tx_rem_bytes;
>         unsigned int rx_rem_bytes;
>         const struct spi_transfer *cur_xfer;
> -       struct completion xfer_done;
> +       struct completion cs_done;
> +       struct completion cancel_done;
> +       struct completion abort_done;

I wonder if it would be better to use the wait_bit() APIs. That would
let us have one word for various bits like CS_DONE, CANCEL_DONE,
ABORT_DONE and then wake up the waiters when the particular bit happens
to come in through the isr. It is probably over-engineering though
because it saves a handful of bytes while increasing complexity.

Otherwise I like this patch.

>         unsigned int oversampling;
>         spinlock_t lock;
> -       enum spi_m_cmd_opcode cur_mcmd;
>         int irq;
>  };
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking
  2020-06-17 20:53   ` Stephen Boyd
@ 2020-06-17 21:19     ` Doug Anderson
  2020-06-18  0:47       ` Stephen Boyd
  0 siblings, 1 reply; 15+ messages in thread
From: Doug Anderson @ 2020-06-17 21:19 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mark Brown, Alok Chauhan, skakit, Andy Gross, Bjorn Andersson,
	Dilip Kota, linux-arm-msm, LKML, linux-spi

Hi,


On Wed, Jun 17, 2020 at 1:53 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Douglas Anderson (2020-06-16 03:40:47)
> > If you added a bit of a delay (like a trace_printk) into the ISR for
> > the spi-geni-qcom driver, you would suddenly start seeing some errors
> > spit out.  The problem was that, though the ISR itself held a lock,
> > other parts of the driver didn't always grab the lock.
> >
> > One example race was this:
> > a) Driver queues off a command to set a Chip Select (CS).
> > b) ISR fires indicating the CS is done.
> > c) Done bit is set, so we complete().
> > d) Second CPU gallops off and starts a transfer.
> > e) Second CPU starts messing with hardware / software state (not under
> >    spinlock).
> > f) ISR now does things like set "mas->cur_mcmd" to CMD_NONE, prints
> >    errors if "tx_rem_bytes" / "rx_rem_bytes" have been set, and also
> >    Acks all interrupts it handled.
>
> Can we get a CPU0/CPU1 diagram here? At point e) I got sort of lost. And
> maybe it's not even a dual CPU problem? i.e. it can happen on one CPU?
>
>     CPU0
>     ----
>  a) spi_geni_set_cs()
>      mas->cur_mcmd = CMD_CS
>      wait_for_completion_timeout(&xfer_done)
>  b)  <INTERRUPT>
>      geni_spi_isr()
>  c)   complete(&xfer_done);
>      <END INTERRUPT>
>      pm_runtime_put(mas->dev);
>  d) galloping?
>
> I got lost... Sorry!

I think you need two CPUs, at least for the race I'm thinking of.
Maybe this is clearer?

CPU1:
=> spi_geni_set_cs() starts
=> spi_geni_set_cs() calls wait_for_completion_timeout(&xfer_done)
CPU0:
=> geni_spi_isr() starts
=> geni_spi_isr() calls complete(&xfer_done)
=> geni_spi_isr() stalls
CPU1:
=> spi_geni_set_cs() call to wait_for_completion_timeout() finishes
=> spi_geni_set_cs() exits.
=> spi_geni_transfer_one() is called
=> spi_geni_transfer_one() calls setup_fifo_xfer()
=> setup_fifo_xfer() sets "cur_mcmd"
=> setup_fifo_xfer() sets "tx_rem_bytes"
=> setup_fifo_xfer() sets "rx_rem_bytes"
=> setup_fifo_xfer() kicks off a transfer
CPU0:
=> geni_spi_isr() finishes stalling
=> geni_spi_isr() sets "cur_mcmd" to NULL
=> geni_spi_isr() checks "tx_rem_bytes" to confirm it's 0.
=> geni_spi_isr() checks "rx_rem_bytes" to confirm it's 0.
=> geni_spi_isr() clears any "DONE" interrupt that is pending

I can update the commit message to have that if it's helpful and makes
sense.  In the above example I have a fake "stall" that wouldn't
really happen, but in general if adding a delay somewhere creates a
race condition then the race condition was there anyway.  Also, with
weakly ordered memory it's possible that a write on one CPU could
clobber a write made by another CPU even if they happened in opposite
orders.

The race is fixed by my patch because when CPU1 starts
setup_fifo_xfer() it won't be able to grab the spinlock until the ISR
is totally done.


> > Let's fix this.  Before we start messing with hardware, we'll grab the
> > lock to make sure that the IRQ handler from some previous command has
> > really finished.  We don't need to hold the lock unless we're in a
> > state where more interrupts can come in, but we at least need to make
> > sure the previous IRQ is done.  This lock is used exclusively to
> > prevent the IRQ handler and non-IRQ from stomping on each other.  The
> > SPI core handles all other mutual exclusion.
>
> Ok maybe the problem is the completion at c) never happens until the
> wait_for_completion_timeout() times out?

No need for a timeout to happen.  Just putting a few "trace_printk"
calls in the ISR at strategic places was enough for me to see the
race.


> > As part of this, we change the way that the IRQ handler detects
> > spurious interrupts.  Previously we checked for our state variable
> > being set to IRQ_NONE, but that was done outside the spinlock.  We
> > could move it into the spinlock, but instead let's just change it to
> > look for the lack of any IRQ status bits being set.  This can be done
> > outside the lock--the hardware certainly isn't grabbing or looking at
> > the spinlock when it updates its status register.
> >
> > It's possible that this will fix real (but very rare) errors seen in
> > the field that look like:
> >   irq ...: nobody cared (try booting with the "irqpoll" option)
> >
> > NOTE: an alternate strategy considered here was to always make the
> > complete() / spi_finalize_current_transfer() the very last thing in
> > our IRQ handler.  With such a change you could consider that we could
> > be "lockless".  In that case, though, we'd have to be very careful w/
> > memory barriers so we made sure we didn't have any bugs with weakly
> > ordered memory.  Using spinlocks makes the driver much easier to
> > understand.
> >
> > Fixes: 561de45f72bd ("spi: spi-geni-qcom: Add SPI driver support for GENI based QUP")
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> > Changes in v3:
> > - Split out some lock cleanup to previous patch.
> > - Don't need to read IRQ status register inside spinlock.
> > - Don't check for state CMD_NONE; later patch is removing state var.
> > - Don't hold the lock for all of setup_fifo_xfer().
> > - Comment about why it's safe to Ack interrupts at the end.
> > - Subject/desc changed since race is definitely there.
> >
> > Changes in v2:
> > - Detect true spurious interrupt.
> > - Still return IRQ_NONE for state machine mismatch, but print warn.
> >
> >  drivers/spi/spi-geni-qcom.c | 45 ++++++++++++++++++++++++++++++++++---
> >  1 file changed, 42 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> > index c7d2c7e45b3f..e0f0e5c241f3 100644
> > --- a/drivers/spi/spi-geni-qcom.c
> > +++ b/drivers/spi/spi-geni-qcom.c
> > @@ -367,6 +384,12 @@ static void setup_fifo_xfer(struct spi_transfer *xfer,
> >         }
> >         writel(spi_tx_cfg, se->base + SE_SPI_TRANS_CFG);
> >         mas->cur_mcmd = CMD_XFER;
> > +
> > +       /*
> > +        * Lock around right before we start the transfer since our
> > +        * interrupt controller could come in at any time now.
>
> drop 'controller'?

Sure, I'll fix in the next spin.

-Doug

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking
  2020-06-17 21:19     ` Doug Anderson
@ 2020-06-18  0:47       ` Stephen Boyd
  0 siblings, 0 replies; 15+ messages in thread
From: Stephen Boyd @ 2020-06-18  0:47 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Mark Brown, Alok Chauhan, skakit, Andy Gross, Bjorn Andersson,
	Dilip Kota, linux-arm-msm, LKML, linux-spi

Quoting Doug Anderson (2020-06-17 14:19:29)
> On Wed, Jun 17, 2020 at 1:53 PM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > Quoting Douglas Anderson (2020-06-16 03:40:47)
> > > If you added a bit of a delay (like a trace_printk) into the ISR for
> > > the spi-geni-qcom driver, you would suddenly start seeing some errors
> > > spit out.  The problem was that, though the ISR itself held a lock,
> > > other parts of the driver didn't always grab the lock.
> > >
> > > One example race was this:
> > > a) Driver queues off a command to set a Chip Select (CS).
> > > b) ISR fires indicating the CS is done.
> > > c) Done bit is set, so we complete().
> > > d) Second CPU gallops off and starts a transfer.
> > > e) Second CPU starts messing with hardware / software state (not under
> > >    spinlock).
> > > f) ISR now does things like set "mas->cur_mcmd" to CMD_NONE, prints
> > >    errors if "tx_rem_bytes" / "rx_rem_bytes" have been set, and also
> > >    Acks all interrupts it handled.
> >
> > Can we get a CPU0/CPU1 diagram here? At point e) I got sort of lost. And
> > maybe it's not even a dual CPU problem? i.e. it can happen on one CPU?
> >
> >     CPU0
> >     ----
> >  a) spi_geni_set_cs()
> >      mas->cur_mcmd = CMD_CS
> >      wait_for_completion_timeout(&xfer_done)
> >  b)  <INTERRUPT>
> >      geni_spi_isr()
> >  c)   complete(&xfer_done);
> >      <END INTERRUPT>
> >      pm_runtime_put(mas->dev);
> >  d) galloping?
> >
> > I got lost... Sorry!
> 
> I think you need two CPUs, at least for the race I'm thinking of.
> Maybe this is clearer?

With threaded irqs I think you only need one CPU, but that's just a
minor detail. Drawing it with two CPUs is clearer and easier to
understand.

> 
> CPU1:
> => spi_geni_set_cs() starts
> => spi_geni_set_cs() calls wait_for_completion_timeout(&xfer_done)
> CPU0:
> => geni_spi_isr() starts
> => geni_spi_isr() calls complete(&xfer_done)
> => geni_spi_isr() stalls
> CPU1:
> => spi_geni_set_cs() call to wait_for_completion_timeout() finishes
> => spi_geni_set_cs() exits.
> => spi_geni_transfer_one() is called
> => spi_geni_transfer_one() calls setup_fifo_xfer()
> => setup_fifo_xfer() sets "cur_mcmd"
> => setup_fifo_xfer() sets "tx_rem_bytes"
> => setup_fifo_xfer() sets "rx_rem_bytes"
> => setup_fifo_xfer() kicks off a transfer
> CPU0:
> => geni_spi_isr() finishes stalling
> => geni_spi_isr() sets "cur_mcmd" to NULL
> => geni_spi_isr() checks "tx_rem_bytes" to confirm it's 0.
> => geni_spi_isr() checks "rx_rem_bytes" to confirm it's 0.
> => geni_spi_isr() clears any "DONE" interrupt that is pending
> 
> I can update the commit message to have that if it's helpful and makes
> sense.  In the above example I have a fake "stall" that wouldn't
> really happen, but in general if adding a delay somewhere creates a
> race condition then the race condition was there anyway.  Also, with
> weakly ordered memory it's possible that a write on one CPU could
> clobber a write made by another CPU even if they happened in opposite
> orders.
> 
> The race is fixed by my patch because when CPU1 starts
> setup_fifo_xfer() it won't be able to grab the spinlock until the ISR
> is totally done.

Ok. This would be the diagram then if it looked like this:

  CPU0                                         CPU1
  ----                                         ----
  spi_geni_set_cs()
   mas->cur_mcmd = CMD_CS;
   geni_se_setup_m_cmd(...)
   wait_for_completion_timeout(&xfer_done);
                                              <INTERRUPT>
                                               geni_spi_isr()
                                                complete(&xfer_done);
   <wakeup>
   pm_runtime_put(mas->dev);
  ... // back to SPI core
  spi_geni_transfer_one()
   setup_fifo_xfer()  
    mas->cur_mcmd = CMD_XFER;
                                                mas->cur_cmd = CMD_NONE; // bad!
                                                return IRQ_HANDLED;

     
Time flows down as it usually does in these diagrams. No need to put
stalls in. Reading all those lines and holding which CPU they're running
on makes it harder for me to see the two running in parallel like is
shown above.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable
  2020-06-17 21:16   ` Stephen Boyd
@ 2020-06-18 15:05     ` Doug Anderson
  0 siblings, 0 replies; 15+ messages in thread
From: Doug Anderson @ 2020-06-18 15:05 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Mark Brown, Alok Chauhan, skakit, Andy Gross, Bjorn Andersson,
	linux-arm-msm, LKML, linux-spi

Hi,

On Wed, Jun 17, 2020 at 2:16 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Douglas Anderson (2020-06-16 03:40:50)
> > diff --git a/drivers/spi/spi-geni-qcom.c b/drivers/spi/spi-geni-qcom.c
> > index 63a62548b078..6feea88d63ac 100644
> > --- a/drivers/spi/spi-geni-qcom.c
> > +++ b/drivers/spi/spi-geni-qcom.c
> > @@ -63,13 +63,6 @@
> >  #define TIMESTAMP_AFTER                BIT(3)
> >  #define POST_CMD_DELAY         BIT(4)
> >
> > -enum spi_m_cmd_opcode {
> > -       CMD_NONE,
> > -       CMD_XFER,
> > -       CMD_CS,
> > -       CMD_CANCEL,
> > -};
> > -
> >  struct spi_geni_master {
> >         struct geni_se se;
> >         struct device *dev;
> > @@ -81,10 +74,11 @@ struct spi_geni_master {
> >         unsigned int tx_rem_bytes;
> >         unsigned int rx_rem_bytes;
> >         const struct spi_transfer *cur_xfer;
> > -       struct completion xfer_done;
> > +       struct completion cs_done;
> > +       struct completion cancel_done;
> > +       struct completion abort_done;
>
> I wonder if it would be better to use the wait_bit() APIs. That would
> let us have one word for various bits like CS_DONE, CANCEL_DONE,
> ABORT_DONE and then wake up the waiters when the particular bit happens
> to come in through the isr. It is probably over-engineering though
> because it saves a handful of bytes while increasing complexity.
>
> Otherwise I like this patch.

I'm going to leave it as-is for v4.  We can always make it a future
improvement if we want.

-Doug

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements
  2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
                   ` (4 preceding siblings ...)
  2020-06-16 10:40 ` [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable Douglas Anderson
@ 2020-06-18 16:48 ` Mark Brown
  5 siblings, 0 replies; 15+ messages in thread
From: Mark Brown @ 2020-06-18 16:48 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Bjorn Andersson, Alok Chauhan, skakit, linux-kernel, linux-spi,
	Dilip Kota, Andy Gross, linux-arm-msm, swboyd

On Tue, 16 Jun 2020 03:40:45 -0700, Douglas Anderson wrote:
> This patch series is a new version of the previous patch posted:
>   [PATCH v2] spi: spi-geni-qcom: Speculative fix of "nobody cared" about interrupt
>   https://lore.kernel.org/r/20200317133653.v2.1.I752ebdcfd5e8bf0de06d66e767b8974932b3620e@changeid
> 
> At this point I've done enough tracing to know that there was a real
> race in the old code (not just weakly ordered memory problems) and
> that should be fixed with the locking patches.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/1] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls
      commit: 539afdf969d6ad7ded543d9abf14596aec411fe9

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-06-18 16:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16 10:40 [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Douglas Anderson
2020-06-16 10:40 ` [PATCH v3 1/5] spi: spi-geni-qcom: No need for irqsave variant of spinlock calls Douglas Anderson
2020-06-17 20:24   ` Stephen Boyd
2020-06-16 10:40 ` [PATCH v3 2/5] spi: spi-geni-qcom: Mo' betta locking Douglas Anderson
2020-06-17 20:53   ` Stephen Boyd
2020-06-17 21:19     ` Doug Anderson
2020-06-18  0:47       ` Stephen Boyd
2020-06-16 10:40 ` [PATCH v3 3/5] spi: spi-geni-qcom: Check for error IRQs Douglas Anderson
2020-06-17 20:54   ` Stephen Boyd
2020-06-16 10:40 ` [PATCH v3 4/5] spi: spi-geni-qcom: Actually use our FIFO Douglas Anderson
2020-06-17 20:56   ` Stephen Boyd
2020-06-16 10:40 ` [PATCH v3 5/5] spi: spi-geni-qcom: Don't keep a local state variable Douglas Anderson
2020-06-17 21:16   ` Stephen Boyd
2020-06-18 15:05     ` Doug Anderson
2020-06-18 16:48 ` [PATCH v3 0/5] spi: spi-geni-qcom: Fixes / perf improvements Mark Brown

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.