All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] Fix TX/RX interrupt handling
@ 2023-03-10 17:20 Mario Limonciello
  2023-03-10 17:20 ` [PATCH 1/2] thunderbolt: Use const qualifier for `ring_interrupt_index` Mario Limonciello
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mario Limonciello @ 2023-03-10 17:20 UTC (permalink / raw)
  To: Basavaraj Natikar, Sanjay R Mehta, Mika Westerberg, linux-usb
  Cc: anson.tsao, Mario Limonciello, linux-kernel

Previously a patch series was sent up to change the way that DROM was read
to prefer directly from NVM instead of bit banging.

This series was produced due to issues found where TBT3 DROM CRC wouldn't
match.  In looking at it from USB4 analyzer the DROM wasn't corrupted
before it arrived at the router.  In analyzing the failure mode, every
single failure occurred during a retried TX because RX interrupt
"never came".

This was actually a smoking gun; when the hardware responded too quickly
both TX and RX interrupt status bits were set before the ISR would run.
By the ISR using auto clear on read to process the TX this would make the
RX interrupt bit get lost and the RX interrupt was never handled.

To fix this issue, disable auto clear in the ISR and instead only clear
the interrupt that is actually triggering the ISR.

This fixes the communication for a long series of transactions such as
bit banging and probably also fixes other situations that control transfers
were retried a number of times due to a missing RX.

Mario Limonciello (2):
  thunderbolt: Use const qualifier for `ring_interrupt_index`
  thunderbolt: Disable interrupt auto clear for rings

 drivers/thunderbolt/nhi.c      | 42 +++++++++++++++++++++-------------
 drivers/thunderbolt/nhi_regs.h |  6 +++--
 2 files changed, 30 insertions(+), 18 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] thunderbolt: Use const qualifier for `ring_interrupt_index`
  2023-03-10 17:20 [PATCH 0/2] Fix TX/RX interrupt handling Mario Limonciello
@ 2023-03-10 17:20 ` Mario Limonciello
  2023-03-10 17:20 ` [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings Mario Limonciello
  2023-03-14 14:28 ` [PATCH 0/2] Fix TX/RX interrupt handling Mika Westerberg
  2 siblings, 0 replies; 5+ messages in thread
From: Mario Limonciello @ 2023-03-10 17:20 UTC (permalink / raw)
  To: Andreas Noever, Michael Jamet, Mika Westerberg, Yehezkel Bernat
  Cc: anson.tsao, Mario Limonciello, Sanju Mehta, linux-usb, linux-kernel

`ring_interrupt_index` doesn't change the data for `ring` so
mark it as const.

Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/thunderbolt/nhi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index 4dce2edd86ea0..fdc0c3ba2ef01 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -46,7 +46,7 @@
 #define QUIRK_AUTO_CLEAR_INT	BIT(0)
 #define QUIRK_E2E		BIT(1)
 
-static int ring_interrupt_index(struct tb_ring *ring)
+static int ring_interrupt_index(const struct tb_ring *ring)
 {
 	int bit = ring->hop;
 	if (!ring->is_tx)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings
  2023-03-10 17:20 [PATCH 0/2] Fix TX/RX interrupt handling Mario Limonciello
  2023-03-10 17:20 ` [PATCH 1/2] thunderbolt: Use const qualifier for `ring_interrupt_index` Mario Limonciello
@ 2023-03-10 17:20 ` Mario Limonciello
  2023-03-13  5:46   ` Mika Westerberg
  2023-03-14 14:28 ` [PATCH 0/2] Fix TX/RX interrupt handling Mika Westerberg
  2 siblings, 1 reply; 5+ messages in thread
From: Mario Limonciello @ 2023-03-10 17:20 UTC (permalink / raw)
  To: Andreas Noever, Michael Jamet, Mika Westerberg, Yehezkel Bernat,
	Basavaraj Natikar, Sanjay R Mehta
  Cc: anson.tsao, Mario Limonciello, Sanju Mehta, linux-usb, linux-kernel

When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts.  If two interrupts have
come in before one can be serviced then this will cause lost interrupts.

On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.

Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.

Fixes: 7a1808f82a37 ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <Sanju.Mehta@amd.com>
Tested-by: Anson Tsao <anson.tsao@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/thunderbolt/nhi.c      | 40 +++++++++++++++++++++-------------
 drivers/thunderbolt/nhi_regs.h |  6 +++--
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/drivers/thunderbolt/nhi.c b/drivers/thunderbolt/nhi.c
index fdc0c3ba2ef01..318d20bd5b695 100644
--- a/drivers/thunderbolt/nhi.c
+++ b/drivers/thunderbolt/nhi.c
@@ -71,24 +71,31 @@ static void ring_interrupt_active(struct tb_ring *ring, bool active)
 		u32 step, shift, ivr, misc;
 		void __iomem *ivr_base;
 		int index;
+		int bit;
 
 		if (ring->is_tx)
 			index = ring->hop;
 		else
 			index = ring->hop + ring->nhi->hop_count;
 
-		if (ring->nhi->quirks & QUIRK_AUTO_CLEAR_INT) {
-			/*
-			 * Ask the hardware to clear interrupt status
-			 * bits automatically since we already know
-			 * which interrupt was triggered.
-			 */
-			misc = ioread32(ring->nhi->iobase + REG_DMA_MISC);
-			if (!(misc & REG_DMA_MISC_INT_AUTO_CLEAR)) {
-				misc |= REG_DMA_MISC_INT_AUTO_CLEAR;
-				iowrite32(misc, ring->nhi->iobase + REG_DMA_MISC);
-			}
-		}
+		/*
+		 * Intel routers support a bit that isn't part of
+		 * the USB4 spec to ask the hardware to clear
+		 * interrupt status bits automatically since
+		 * we already know which interrupt was triggered.
+		 *
+		 * Other routers explicitly disable auto-clear
+		 * to prevent conditions that may occur where two
+		 * MSIX interrupts are simultaneously active and
+		 * reading the register clears both of them.
+		 */
+		misc = ioread32(ring->nhi->iobase + REG_DMA_MISC);
+		if (ring->nhi->quirks & QUIRK_AUTO_CLEAR_INT)
+			bit = REG_DMA_MISC_INT_AUTO_CLEAR;
+		else
+			bit = REG_DMA_MISC_DISABLE_AUTO_CLEAR;
+		if (!(misc & bit))
+			iowrite32(misc | bit, ring->nhi->iobase + REG_DMA_MISC);
 
 		ivr_base = ring->nhi->iobase + REG_INT_VEC_ALLOC_BASE;
 		step = index / REG_INT_VEC_ALLOC_REGS * REG_INT_VEC_ALLOC_BITS;
@@ -393,14 +400,17 @@ EXPORT_SYMBOL_GPL(tb_ring_poll_complete);
 
 static void ring_clear_msix(const struct tb_ring *ring)
 {
+	int bit;
+
 	if (ring->nhi->quirks & QUIRK_AUTO_CLEAR_INT)
 		return;
 
+	bit = ring_interrupt_index(ring) & 31;
 	if (ring->is_tx)
-		ioread32(ring->nhi->iobase + REG_RING_NOTIFY_BASE);
+		iowrite32(BIT(bit), ring->nhi->iobase + REG_RING_INT_CLEAR);
 	else
-		ioread32(ring->nhi->iobase + REG_RING_NOTIFY_BASE +
-			 4 * (ring->nhi->hop_count / 32));
+		iowrite32(BIT(bit), ring->nhi->iobase + REG_RING_INT_CLEAR +
+			  4 * (ring->nhi->hop_count / 32));
 }
 
 static irqreturn_t ring_msix(int irq, void *data)
diff --git a/drivers/thunderbolt/nhi_regs.h b/drivers/thunderbolt/nhi_regs.h
index 0d4970dcef842..faef165a919cc 100644
--- a/drivers/thunderbolt/nhi_regs.h
+++ b/drivers/thunderbolt/nhi_regs.h
@@ -77,12 +77,13 @@ struct ring_desc {
 
 /*
  * three bitfields: tx, rx, rx overflow
- * Every bitfield contains one bit for every hop (REG_HOP_COUNT). Registers are
- * cleared on read. New interrupts are fired only after ALL registers have been
+ * Every bitfield contains one bit for every hop (REG_HOP_COUNT).
+ * New interrupts are fired only after ALL registers have been
  * read (even those containing only disabled rings).
  */
 #define REG_RING_NOTIFY_BASE	0x37800
 #define RING_NOTIFY_REG_COUNT(nhi) ((31 + 3 * nhi->hop_count) / 32)
+#define REG_RING_INT_CLEAR	0x37808
 
 /*
  * two bitfields: rx, tx
@@ -105,6 +106,7 @@ struct ring_desc {
 
 #define REG_DMA_MISC			0x39864
 #define REG_DMA_MISC_INT_AUTO_CLEAR     BIT(2)
+#define REG_DMA_MISC_DISABLE_AUTO_CLEAR	BIT(17)
 
 #define REG_INMAIL_DATA			0x39900
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings
  2023-03-10 17:20 ` [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings Mario Limonciello
@ 2023-03-13  5:46   ` Mika Westerberg
  0 siblings, 0 replies; 5+ messages in thread
From: Mika Westerberg @ 2023-03-13  5:46 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: Andreas Noever, Michael Jamet, Yehezkel Bernat,
	Basavaraj Natikar, Sanjay R Mehta, anson.tsao, linux-usb,
	linux-kernel

Hi Mario,

On Fri, Mar 10, 2023 at 11:20:50AM -0600, Mario Limonciello wrote:
> When interrupt auto clear is programmed, any read to the interrupt
> status register will clear all interrupts.  If two interrupts have
> come in before one can be serviced then this will cause lost interrupts.
> 
> On AMD USB4 routers this has manifested in odd problems particularly
> with long strings of control tranfers such as reading the DROM via bit
> banging.

Nice catch! Does this mean we can drop [1] now?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt.git/commit/?h=next&id=8d7f459107f74fbbdde3dd5b3874d2e748cb8a21

I would still like to keep the nice refactor you did for the DROM
parsing, though.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/2] Fix TX/RX interrupt handling
  2023-03-10 17:20 [PATCH 0/2] Fix TX/RX interrupt handling Mario Limonciello
  2023-03-10 17:20 ` [PATCH 1/2] thunderbolt: Use const qualifier for `ring_interrupt_index` Mario Limonciello
  2023-03-10 17:20 ` [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings Mario Limonciello
@ 2023-03-14 14:28 ` Mika Westerberg
  2 siblings, 0 replies; 5+ messages in thread
From: Mika Westerberg @ 2023-03-14 14:28 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: Basavaraj Natikar, Sanjay R Mehta, linux-usb, anson.tsao, linux-kernel

Hi Mario,

On Fri, Mar 10, 2023 at 11:20:48AM -0600, Mario Limonciello wrote:
> Previously a patch series was sent up to change the way that DROM was read
> to prefer directly from NVM instead of bit banging.
> 
> This series was produced due to issues found where TBT3 DROM CRC wouldn't
> match.  In looking at it from USB4 analyzer the DROM wasn't corrupted
> before it arrived at the router.  In analyzing the failure mode, every
> single failure occurred during a retried TX because RX interrupt
> "never came".
> 
> This was actually a smoking gun; when the hardware responded too quickly
> both TX and RX interrupt status bits were set before the ISR would run.
> By the ISR using auto clear on read to process the TX this would make the
> RX interrupt bit get lost and the RX interrupt was never handled.
> 
> To fix this issue, disable auto clear in the ISR and instead only clear
> the interrupt that is actually triggering the ISR.
> 
> This fixes the communication for a long series of transactions such as
> bit banging and probably also fixes other situations that control transfers
> were retried a number of times due to a missing RX.
> 
> Mario Limonciello (2):
>   thunderbolt: Use const qualifier for `ring_interrupt_index`
>   thunderbolt: Disable interrupt auto clear for rings

Applied both to thunderbolt.git/fixes for v6.3-rc and marked them for
stable as well. Thanks! I dropped the other patch that adjusted the NVM
reading as now it is not necessary anymore (please correct me if I'm
mistaken).

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-03-14 14:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 17:20 [PATCH 0/2] Fix TX/RX interrupt handling Mario Limonciello
2023-03-10 17:20 ` [PATCH 1/2] thunderbolt: Use const qualifier for `ring_interrupt_index` Mario Limonciello
2023-03-10 17:20 ` [PATCH 2/2] thunderbolt: Disable interrupt auto clear for rings Mario Limonciello
2023-03-13  5:46   ` Mika Westerberg
2023-03-14 14:28 ` [PATCH 0/2] Fix TX/RX interrupt handling Mika Westerberg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.