netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
@ 2021-03-04 19:52 michael-dev
  2021-03-04 20:17 ` Vladimir Oltean
  2021-03-05 21:30 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 4+ messages in thread
From: michael-dev @ 2021-03-04 19:52 UTC (permalink / raw)
  To: claudiu.manoil; +Cc: netdev, Michael Braun

From: Michael Braun <michael-dev@fami-braun.de>

When using jumbo packets and overrunning rx queue with napi enabled,
the following sequence is observed in gfar_add_rx_frag:

   | lstatus                              |       | skb                   |
t  | lstatus,  size, flags                | first | len, data_len, *ptr   |
---+--------------------------------------+-------+-----------------------+
13 | 18002348, 9032, INTERRUPT LAST       | 0     | 9600, 8000,  f554c12e |
12 | 10000640, 1600, INTERRUPT            | 0     | 8000, 6400,  f554c12e |
11 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  f554c12e |
10 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  f554c12e |
09 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  f554c12e |
08 | 14000640, 1600, INTERRUPT FIRST      | 0     | 1600, 0,     f554c12e |
07 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     f554c12e |
06 | 1c000080, 128,  INTERRUPT LAST FIRST | 1     | 0,    0,     abf3bd6e |
05 | 18002348, 9032, INTERRUPT LAST       | 0     | 8000, 6400,  c5a57780 |
04 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  c5a57780 |
03 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  c5a57780 |
02 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  c5a57780 |
01 | 10000640, 1600, INTERRUPT            | 0     | 1600, 0,     c5a57780 |
00 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     c5a57780 |

So at t=7 a new packets is started but not finished, probably due to rx
overrun - but rx overrun is not indicated in the flags. Instead a new
packets starts at t=8. This results in skb->len to exceed size for the LAST
fragment at t=13 and thus a negative fragment size added to the skb.

This then crashes:

kernel BUG at include/linux/skbuff.h:2277!
Oops: Exception in kernel mode, sig: 5 [#1]
...
NIP [c04689f4] skb_pull+0x2c/0x48
LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
Call Trace:
[ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
[ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
[ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
[ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
[ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
[ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
[ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
[ec4bff08] [c0062718] kthread+0x140/0x144
[ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c

This patch fixes this by checking for computed LAST fragment size, so a
negative sized fragment is never added.
In order to prevent the newer rx frame from getting corrupted, the FIRST
flag is checked to discard the incomplete older frame.

Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
---
 drivers/net/ethernet/freescale/gianfar.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 541de32ea662..1cf8ef717453 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -2390,6 +2390,10 @@ static bool gfar_add_rx_frag(struct gfar_rx_buff *rxb, u32 lstatus,
 		if (lstatus & BD_LFLAG(RXBD_LAST))
 			size -= skb->len;
 
+		WARN(size < 0, "gianfar: rx fragment size underflow");
+		if (size < 0)
+			return false;
+
 		skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
 				rxb->page_offset + RXBUF_ALIGNMENT,
 				size, GFAR_RXB_TRUESIZE);
@@ -2552,6 +2556,17 @@ static int gfar_clean_rx_ring(struct gfar_priv_rx_q *rx_queue,
 		if (lstatus & BD_LFLAG(RXBD_EMPTY))
 			break;
 
+		/* lost RXBD_LAST descriptor due to overrun */
+		if (skb &&
+		    (lstatus & BD_LFLAG(RXBD_FIRST))) {
+			/* discard faulty buffer */
+			dev_kfree_skb(skb);
+			skb = NULL;
+			rx_queue->stats.rx_dropped++;
+
+			/* can continue normally */
+		}
+
 		/* order rx buffer descriptor reads */
 		rmb();
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
  2021-03-04 19:52 [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash michael-dev
@ 2021-03-04 20:17 ` Vladimir Oltean
  2021-03-07 20:43   ` michael-dev
  2021-03-05 21:30 ` patchwork-bot+netdevbpf
  1 sibling, 1 reply; 4+ messages in thread
From: Vladimir Oltean @ 2021-03-04 20:17 UTC (permalink / raw)
  To: michael-dev; +Cc: claudiu.manoil, netdev

Hi Michael,

On Thu, Mar 04, 2021 at 08:52:52PM +0100, michael-dev@fami-braun.de wrote:
> From: Michael Braun <michael-dev@fami-braun.de>
>
> When using jumbo packets and overrunning rx queue with napi enabled,
> the following sequence is observed in gfar_add_rx_frag:
>
>    | lstatus                              |       | skb                   |
> t  | lstatus,  size, flags                | first | len, data_len, *ptr   |
> ---+--------------------------------------+-------+-----------------------+
> 13 | 18002348, 9032, INTERRUPT LAST       | 0     | 9600, 8000,  f554c12e |
> 12 | 10000640, 1600, INTERRUPT            | 0     | 8000, 6400,  f554c12e |
> 11 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  f554c12e |
> 10 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  f554c12e |
> 09 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  f554c12e |
> 08 | 14000640, 1600, INTERRUPT FIRST      | 0     | 1600, 0,     f554c12e |
> 07 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     f554c12e |
> 06 | 1c000080, 128,  INTERRUPT LAST FIRST | 1     | 0,    0,     abf3bd6e |
> 05 | 18002348, 9032, INTERRUPT LAST       | 0     | 8000, 6400,  c5a57780 |
> 04 | 10000640, 1600, INTERRUPT            | 0     | 6400, 4800,  c5a57780 |
> 03 | 10000640, 1600, INTERRUPT            | 0     | 4800, 3200,  c5a57780 |
> 02 | 10000640, 1600, INTERRUPT            | 0     | 3200, 1600,  c5a57780 |
> 01 | 10000640, 1600, INTERRUPT            | 0     | 1600, 0,     c5a57780 |
> 00 | 14000640, 1600, INTERRUPT FIRST      | 1     | 0,    0,     c5a57780 |
>
> So at t=7 a new packets is started but not finished, probably due to rx
> overrun - but rx overrun is not indicated in the flags. Instead a new
> packets starts at t=8. This results in skb->len to exceed size for the LAST
> fragment at t=13 and thus a negative fragment size added to the skb.
>
> This then crashes:
>
> kernel BUG at include/linux/skbuff.h:2277!
> Oops: Exception in kernel mode, sig: 5 [#1]
> ...
> NIP [c04689f4] skb_pull+0x2c/0x48
> LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844
> Call Trace:
> [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable)
> [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4
> [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c
> [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0
> [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc
> [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70
> [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc
> [ec4bff08] [c0062718] kthread+0x140/0x144
> [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c
>
> This patch fixes this by checking for computed LAST fragment size, so a
> negative sized fragment is never added.
> In order to prevent the newer rx frame from getting corrupted, the FIRST
> flag is checked to discard the incomplete older frame.
>
> Signed-off-by: Michael Braun <michael-dev@fami-braun.de>
> ---

Just for my understanding, do you have a reproducer for the issue?
I notice you haven't answered Claudiu's questions posted on v1.
On LS1021A I cannot trigger this apparent hardware issue even if I force
RX overruns (by reducing the ring size). Judging from the "NIP" register
from your stack trace, this is a PowerPC device, which one is it?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
  2021-03-04 19:52 [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash michael-dev
  2021-03-04 20:17 ` Vladimir Oltean
@ 2021-03-05 21:30 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 4+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-03-05 21:30 UTC (permalink / raw)
  To: Michael Braun; +Cc: claudiu.manoil, netdev

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Thu,  4 Mar 2021 20:52:52 +0100 you wrote:
> From: Michael Braun <michael-dev@fami-braun.de>
> 
> When using jumbo packets and overrunning rx queue with napi enabled,
> the following sequence is observed in gfar_add_rx_frag:
> 
>    | lstatus                              |       | skb                   |
> t  | lstatus,  size, flags                | first | len, data_len, *ptr   |
> 
> [...]

Here is the summary with links:
  - [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
    https://git.kernel.org/netdev/net/c/d8861bab48b6

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash
  2021-03-04 20:17 ` Vladimir Oltean
@ 2021-03-07 20:43   ` michael-dev
  0 siblings, 0 replies; 4+ messages in thread
From: michael-dev @ 2021-03-07 20:43 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: claudiu.manoil, netdev

Am 04.03.2021 21:17, schrieb Vladimir Oltean:
> Just for my understanding, do you have a reproducer for the issue?
> I notice you haven't answered Claudiu's questions posted on v1.
> On LS1021A I cannot trigger this apparent hardware issue even if I 
> force
> RX overruns (by reducing the ring size). Judging from the "NIP" 
> register
> from your stack trace, this is a PowerPC device, which one is it?

This is on P1020WLAN (AP) devices and it happens reproducably when I use 
ipsec + iperf3 --udp -b 1000M on some other server targetting the AP.
Yes this is PPC.

Regards,
Michael


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-03-07 20:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-04 19:52 [PATCHv2] gianfar: fix jumbo packets+napi+rx overrun crash michael-dev
2021-03-04 20:17 ` Vladimir Oltean
2021-03-07 20:43   ` michael-dev
2021-03-05 21:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).