All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive
@ 2022-03-03 18:10 Robert Hancock
  2022-03-04  9:19 ` Claudiu.Beznea
  2022-03-04 12:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Robert Hancock @ 2022-03-03 18:10 UTC (permalink / raw)
  To: netdev
  Cc: nicolas.ferre, claudiu.beznea, davem, kuba, soren.brinkmann,
	scott.mcnutt, Robert Hancock, stable

There is an oddity in the way the RSR register flags propagate to the
ISR register (and the actual interrupt output) on this hardware: it
appears that RSR register bits only result in ISR being asserted if the
interrupt was actually enabled at the time, so enabling interrupts with
RSR bits already set doesn't trigger an interrupt to be raised. There
was already a partial fix for this race in the macb_poll function where
it checked for RSR bits being set and re-triggered NAPI receive.
However, there was a still a race window between checking RSR and
actually enabling interrupts, where a lost wakeup could happen. It's
necessary to check again after enabling interrupts to see if RSR was set
just prior to the interrupt being enabled, and re-trigger receive in that
case.

This issue was noticed in a point-to-point UDP request-response protocol
which periodically saw timeouts or abnormally high response times due to
received packets not being processed in a timely fashion. In many
applications, more packets arriving, including TCP retransmissions, would
cause the original packet to be processed, thus masking the issue.

Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done")
Cc: stable@vger.kernel.org
Co-developed-by: Scott McNutt <scott.mcnutt@siriusxm.com>
Signed-off-by: Scott McNutt <scott.mcnutt@siriusxm.com>
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
---

Changes since v1:
-removed unrelated cleanup
-added notes on observed frequency of branches to comments

 drivers/net/ethernet/cadence/macb_main.c | 25 +++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 98498a76ae16..d13f06cf0308 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -1573,7 +1573,14 @@ static int macb_poll(struct napi_struct *napi, int budget)
 	if (work_done < budget) {
 		napi_complete_done(napi, work_done);
 
-		/* Packets received while interrupts were disabled */
+		/* RSR bits only seem to propagate to raise interrupts when
+		 * interrupts are enabled at the time, so if bits are already
+		 * set due to packets received while interrupts were disabled,
+		 * they will not cause another interrupt to be generated when
+		 * interrupts are re-enabled.
+		 * Check for this case here. This has been seen to happen
+		 * around 30% of the time under heavy network load.
+		 */
 		status = macb_readl(bp, RSR);
 		if (status) {
 			if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
@@ -1581,6 +1588,22 @@ static int macb_poll(struct napi_struct *napi, int budget)
 			napi_reschedule(napi);
 		} else {
 			queue_writel(queue, IER, bp->rx_intr_mask);
+
+			/* In rare cases, packets could have been received in
+			 * the window between the check above and re-enabling
+			 * interrupts. Therefore, a double-check is required
+			 * to avoid losing a wakeup. This can potentially race
+			 * with the interrupt handler doing the same actions
+			 * if an interrupt is raised just after enabling them,
+			 * but this should be harmless.
+			 */
+			status = macb_readl(bp, RSR);
+			if (unlikely(status)) {
+				queue_writel(queue, IDR, bp->rx_intr_mask);
+				if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
+					queue_writel(queue, ISR, MACB_BIT(RCOMP));
+				napi_schedule(napi);
+			}
 		}
 	}
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive
  2022-03-03 18:10 [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive Robert Hancock
@ 2022-03-04  9:19 ` Claudiu.Beznea
  2022-03-04 12:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Claudiu.Beznea @ 2022-03-04  9:19 UTC (permalink / raw)
  To: robert.hancock, netdev
  Cc: Nicolas.Ferre, davem, kuba, soren.brinkmann, scott.mcnutt, stable

On 03.03.2022 20:10, Robert Hancock wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> 
> There is an oddity in the way the RSR register flags propagate to the
> ISR register (and the actual interrupt output) on this hardware: it
> appears that RSR register bits only result in ISR being asserted if the
> interrupt was actually enabled at the time, so enabling interrupts with
> RSR bits already set doesn't trigger an interrupt to be raised. There
> was already a partial fix for this race in the macb_poll function where
> it checked for RSR bits being set and re-triggered NAPI receive.
> However, there was a still a race window between checking RSR and
> actually enabling interrupts, where a lost wakeup could happen. It's
> necessary to check again after enabling interrupts to see if RSR was set
> just prior to the interrupt being enabled, and re-trigger receive in that
> case.
> 
> This issue was noticed in a point-to-point UDP request-response protocol
> which periodically saw timeouts or abnormally high response times due to
> received packets not being processed in a timely fashion. In many
> applications, more packets arriving, including TCP retransmissions, would
> cause the original packet to be processed, thus masking the issue.
> 
> Fixes: 02f7a34f34e3 ("net: macb: Re-enable RX interrupt only when RX is done")
> Cc: stable@vger.kernel.org
> Co-developed-by: Scott McNutt <scott.mcnutt@siriusxm.com>
> Signed-off-by: Scott McNutt <scott.mcnutt@siriusxm.com>
> Signed-off-by: Robert Hancock <robert.hancock@calian.com>

Tested on SAMA7G5:
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>

> ---
> 
> Changes since v1:
> -removed unrelated cleanup
> -added notes on observed frequency of branches to comments
> 
>  drivers/net/ethernet/cadence/macb_main.c | 25 +++++++++++++++++++++++-
>  1 file changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index 98498a76ae16..d13f06cf0308 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -1573,7 +1573,14 @@ static int macb_poll(struct napi_struct *napi, int budget)
>         if (work_done < budget) {
>                 napi_complete_done(napi, work_done);
> 
> -               /* Packets received while interrupts were disabled */
> +               /* RSR bits only seem to propagate to raise interrupts when
> +                * interrupts are enabled at the time, so if bits are already
> +                * set due to packets received while interrupts were disabled,
> +                * they will not cause another interrupt to be generated when
> +                * interrupts are re-enabled.
> +                * Check for this case here. This has been seen to happen
> +                * around 30% of the time under heavy network load.
> +                */
>                 status = macb_readl(bp, RSR);
>                 if (status) {
>                         if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> @@ -1581,6 +1588,22 @@ static int macb_poll(struct napi_struct *napi, int budget)
>                         napi_reschedule(napi);
>                 } else {
>                         queue_writel(queue, IER, bp->rx_intr_mask);
> +
> +                       /* In rare cases, packets could have been received in
> +                        * the window between the check above and re-enabling
> +                        * interrupts. Therefore, a double-check is required
> +                        * to avoid losing a wakeup. This can potentially race
> +                        * with the interrupt handler doing the same actions
> +                        * if an interrupt is raised just after enabling them,
> +                        * but this should be harmless.
> +                        */
> +                       status = macb_readl(bp, RSR);
> +                       if (unlikely(status)) {
> +                               queue_writel(queue, IDR, bp->rx_intr_mask);
> +                               if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE)
> +                                       queue_writel(queue, ISR, MACB_BIT(RCOMP));
> +                               napi_schedule(napi);
> +                       }
>                 }
>         }
> 
> --
> 2.31.1
> 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive
  2022-03-03 18:10 [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive Robert Hancock
  2022-03-04  9:19 ` Claudiu.Beznea
@ 2022-03-04 12:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-04 12:10 UTC (permalink / raw)
  To: Robert Hancock
  Cc: netdev, nicolas.ferre, claudiu.beznea, davem, kuba,
	soren.brinkmann, scott.mcnutt, stable

Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu,  3 Mar 2022 12:10:27 -0600 you wrote:
> There is an oddity in the way the RSR register flags propagate to the
> ISR register (and the actual interrupt output) on this hardware: it
> appears that RSR register bits only result in ISR being asserted if the
> interrupt was actually enabled at the time, so enabling interrupts with
> RSR bits already set doesn't trigger an interrupt to be raised. There
> was already a partial fix for this race in the macb_poll function where
> it checked for RSR bits being set and re-triggered NAPI receive.
> However, there was a still a race window between checking RSR and
> actually enabling interrupts, where a lost wakeup could happen. It's
> necessary to check again after enabling interrupts to see if RSR was set
> just prior to the interrupt being enabled, and re-trigger receive in that
> case.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: macb: Fix lost RX packet wakeup race in NAPI receive
    https://git.kernel.org/netdev/net/c/0bf476fc3624

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-03-04 12:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-03 18:10 [PATCH net v2] net: macb: Fix lost RX packet wakeup race in NAPI receive Robert Hancock
2022-03-04  9:19 ` Claudiu.Beznea
2022-03-04 12:10 ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.