All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] rtw88: fix skb_under_panic in tx path
@ 2020-06-25 20:18 Nick Owens
  2020-06-26  0:10 ` Thomas Pedersen
  0 siblings, 1 reply; 4+ messages in thread
From: Nick Owens @ 2020-06-25 20:18 UTC (permalink / raw)
  To: linux-wireless; +Cc: Yan-Hsuan Chuang

hello :)

this change fixes a reliable crash on my thinkpad A485.

please note i have no prior experience doing kernel development or
sending patches, and i'm not sure if this is a correct approach.

--

From aa589182d30a0f99e1b3201ed4f3830e8af71dac Mon Sep 17 00:00:00 2001
From: Nick Owens <mischief@offblast.org>
Date: Thu, 25 Jun 2020 12:55:41 -0700
Subject: [PATCH] rtw88: fix skb_under_panic in tx path

fixes the following panic on my thinkpad A485

Oops#1 Part3
<0>[ 3743.881656] skbuff: skb_under_panic: text:000000005f69fd98 len:208 put:48 head:000000009e2719e8 data:00000000bd3795e0 tail:0xc2 end:0x2c0 dev:wlp2s0
<4>[ 3743.881675] ------------[ cut here ]------------
<2>[ 3743.881677] kernel BUG at net/core/skbuff.c:109!
<4>[ 3743.881688] invalid opcode: 0000 [#1] SMP NOPTI
<4>[ 3743.881693] CPU: 7 PID: 665 Comm: irq/85-rtwpci Tainted: G            E     5.7.5 #31
<4>[ 3743.881695] Hardware name: LENOVO 20MU000TUS/20MU000TUS, BIOS R0WET56W (1.24 ) 06/28/2019
<4>[ 3743.881703] RIP: 0010:skb_panic+0x48/0x4a
<4>[ 3743.881706] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 68 14 51 bd e8 d5 22 ab ff <0f> 0b 48 8b 14 24 48 c7 c1 00 c7 2c bd e8 a6 ff ff ff 48 c7 c6 40
<4>[ 3743.881708] RSP: 0018:ffffb354002fce00 EFLAGS: 00010246
<4>[ 3743.881711] RAX: 0000000000000088 RBX: ffff954377fe1e80 RCX: 0000000000000000
<4>[ 3743.881713] RDX: 0000000000000000 RSI: ffff95437ffd8968 RDI: ffff95437ffd8968
<4>[ 3743.881714] RBP: ffff954362d7d000 R08: 0000000000000485 R09: 0000000000000097
<4>[ 3743.881716] R10: 0000000000000000 R11: ffffb354002fccb0 R12: 0000000000000030
<4>[ 3743.881717] R13: 0000000000000001 R14: ffffb354002fcf08 R15: ffffffffc163aba0
<4>[ 3743.881720] FS:  0000000000000000(0000) GS:ffff95437ffc0000(0000) knlGS:0000000000000000
<4>[ 3743.881721] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 3743.881723] CR2: 00007f63e45e8cb0 CR3: 00000003c7fb0000 CR4: 00000000003406e0
<4>[ 3743.881724] Call Trace:
<4>[ 3743.881728]  <IRQ>
<4>[ 3743.881733]  skb_push.cold.98+0x10/0x10
<4>[ 3743.881741]  rtw_pci_tx_write_data+0xb1/0x4e0 [rtwpci]
<4>[ 3743.881746]  rtw_pci_tx_write+0x59/0xe7 [rtwpci]
Panic#2 Part3
<4>[ 3743.881755]  rtw_tx_tasklet+0xfd/0x1f0 [rtw88]
<4>[ 3743.881763]  tasklet_action_common.isra.20+0x4e/0xf0
<4>[ 3743.881769]  __do_softirq+0xd9/0x2d9
<4>[ 3743.881773]  do_softirq_own_stack+0x2a/0x40
<4>[ 3743.881775]  </IRQ>
<4>[ 3743.881778]  do_softirq.part.18+0x2b/0x30
<4>[ 3743.881780]  __local_bh_enable_ip+0x4b/0x50
<4>[ 3743.881784]  rtw_pci_interrupt_threadfn+0x154/0x230 [rtwpci]
<4>[ 3743.881789]  ? irq_forced_thread_fn+0x70/0x70
<4>[ 3743.881791]  irq_thread_fn+0x1f/0x50
<4>[ 3743.881794]  irq_thread+0xe7/0x160
<4>[ 3743.881797]  ? wake_threads_waitq+0x30/0x30
<4>[ 3743.881800]  ? irq_thread_check_affinity+0x80/0x80
<4>[ 3743.881804]  kthread+0x112/0x130
<4>[ 3743.881807]  ? kthread_park+0x80/0x80
<4>[ 3743.881810]  ret_from_fork+0x22/0x40
<4>[ 3743.881813] Modules linked in: ctr(E) ccm(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) nft_counter(E) nft_chain_nat(E) xt_addrtype(E) nft_compat(E) nf_tables(E) nfnetlink(E) xt_conntrack(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) br_netfilter(E) bridge(E) stp(E) llc(E) overlay(E) cmac(E) bnep(E) nls_ascii(E) nls_cp437(E) vfat(E) snd_hda_codec_realtek(E) fat(E) btusb(E) btrtl(E) snd_hda_codec_generic(E) btbcm(E) rtwpci(E) btintel(E) uvcvideo(E) snd_hda_codec_hdmi(E) bluetooth(E) rtw88(E) videobuf2_vmalloc(E) edac_mce_amd(E) snd_hda_intel(E) videobuf2_memops(E) snd_intel_dspcfg(E) videobuf2_v4l2(E) mac80211(E) videobuf2_common(E) snd_hda_codec(E) drbg(E) efi_pstore(E) kvm_amd(E) ansi_cprng(E) joydev(E) snd_hda_core(E) kvm(E) snd_hwdep(E) videodev(E) ecdh_generic(E) snd_pcm(E) irqbypass(E) pcspkr(E) serio_raw(E) efivars(E) ftdi_sio(E) wmi_bmof(E) k10temp(E) mc(E) sp5100_tco(E) ecc(E) tpm_crb(E) usbserial(E) snd_timer(E) ccp(E) cfg80211(E)

Signed-off-by: Nick Owens <mischief@offblast.org>
Cc: stable@vger.kernel.org
---
 drivers/net/wireless/realtek/rtw88/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index d735f3127fe8..21b3b268cb25 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -741,6 +741,12 @@ static int rtw_pci_tx_write_data(struct rtw_dev *rtwdev,
 	else if (!avail_desc(ring->r.wp, ring->r.rp, ring->r.len))
 		return -ENOSPC;
 
+	if (skb_headroom(skb) < chip->tx_pkt_desc_sz &&
+	    pskb_expand_head(skb, chip->tx_pkt_desc_sz - skb_headroom(skb), 0, GFP_ATOMIC)) {
+		dev_err(rtwdev->dev, "no headroom available");
+		return -ENOMEM;
+	}
+
 	pkt_desc = skb_push(skb, chip->tx_pkt_desc_sz);
 	memset(pkt_desc, 0, tx_pkt_desc_sz);
 	pkt_info->qsel = rtw_pci_get_tx_qsel(skb, queue);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] rtw88: fix skb_under_panic in tx path
  2020-06-25 20:18 [PATCH] rtw88: fix skb_under_panic in tx path Nick Owens
@ 2020-06-26  0:10 ` Thomas Pedersen
  2020-06-29  5:20   ` Tony Chuang
  0 siblings, 1 reply; 4+ messages in thread
From: Thomas Pedersen @ 2020-06-26  0:10 UTC (permalink / raw)
  To: Nick Owens; +Cc: linux-wireless, Yan-Hsuan Chuang

On 2020-06-25 13:18, Nick Owens wrote:
> hello :)

Hi Nick :)

> this change fixes a reliable crash on my thinkpad A485.
> 
> please note i have no prior experience doing kernel development or
> sending patches, and i'm not sure if this is a correct approach.

You probably want to submit patches with git-send-email. See 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

> From aa589182d30a0f99e1b3201ed4f3830e8af71dac Mon Sep 17 00:00:00 2001
> From: Nick Owens <mischief@offblast.org>
> Date: Thu, 25 Jun 2020 12:55:41 -0700
> Subject: [PATCH] rtw88: fix skb_under_panic in tx path
> 
> fixes the following panic on my thinkpad A485
> 
> Oops#1 Part3
> <0>[ 3743.881656] skbuff: skb_under_panic: text:000000005f69fd98
> len:208 put:48 head:000000009e2719e8 data:00000000bd3795e0 tail:0xc2
> end:0x2c0 dev:wlp2s0

skb->head and skb->data here are really far (0.5GB) apart. Maybe 
skb->data actually got corrupted earlier?

> diff --git a/drivers/net/wireless/realtek/rtw88/pci.c
> b/drivers/net/wireless/realtek/rtw88/pci.c
> index d735f3127fe8..21b3b268cb25 100644
> --- a/drivers/net/wireless/realtek/rtw88/pci.c
> +++ b/drivers/net/wireless/realtek/rtw88/pci.c
> @@ -741,6 +741,12 @@ static int rtw_pci_tx_write_data(struct rtw_dev 
> *rtwdev,
>  	else if (!avail_desc(ring->r.wp, ring->r.rp, ring->r.len))
>  		return -ENOSPC;
> 
> +	if (skb_headroom(skb) < chip->tx_pkt_desc_sz &&
> +	    pskb_expand_head(skb, chip->tx_pkt_desc_sz - skb_headroom(skb),
> 0, GFP_ATOMIC)) {
> +		dev_err(rtwdev->dev, "no headroom available");
> +		return -ENOMEM;
> +	}
> +

If it is a headroom issue, you can actually express the needed headroom 
needed by the driver in hw->extra_tx_headroom during init and avoid the 
pskb_expand_head() here.

-- 
thomas

^ permalink raw reply	[flat|nested] 4+ messages in thread

* RE: [PATCH] rtw88: fix skb_under_panic in tx path
  2020-06-26  0:10 ` Thomas Pedersen
@ 2020-06-29  5:20   ` Tony Chuang
  2020-11-21  1:43     ` Brian Norris
  0 siblings, 1 reply; 4+ messages in thread
From: Tony Chuang @ 2020-06-29  5:20 UTC (permalink / raw)
  To: Thomas Pedersen, Nick Owens; +Cc: linux-wireless

> On 2020-06-25 13:18, Nick Owens wrote:
> > hello :)
> 
> Hi Nick :)
> 
> > this change fixes a reliable crash on my thinkpad A485.
> >
> > please note i have no prior experience doing kernel development or
> > sending patches, and i'm not sure if this is a correct approach.
> 
> You probably want to submit patches with git-send-email. See
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatch
> es
> 
> > From aa589182d30a0f99e1b3201ed4f3830e8af71dac Mon Sep 17 00:00:00
> 2001
> > From: Nick Owens <mischief@offblast.org>
> > Date: Thu, 25 Jun 2020 12:55:41 -0700
> > Subject: [PATCH] rtw88: fix skb_under_panic in tx path
> >
> > fixes the following panic on my thinkpad A485
> >
> > Oops#1 Part3
> > <0>[ 3743.881656] skbuff: skb_under_panic: text:000000005f69fd98
> > len:208 put:48 head:000000009e2719e8 data:00000000bd3795e0 tail:0xc2
> > end:0x2c0 dev:wlp2s0
> 
> skb->head and skb->data here are really far (0.5GB) apart. Maybe
> skb->data actually got corrupted earlier?
> 
> > diff --git a/drivers/net/wireless/realtek/rtw88/pci.c
> > b/drivers/net/wireless/realtek/rtw88/pci.c
> > index d735f3127fe8..21b3b268cb25 100644
> > --- a/drivers/net/wireless/realtek/rtw88/pci.c
> > +++ b/drivers/net/wireless/realtek/rtw88/pci.c
> > @@ -741,6 +741,12 @@ static int rtw_pci_tx_write_data(struct rtw_dev
> > *rtwdev,
> >  	else if (!avail_desc(ring->r.wp, ring->r.rp, ring->r.len))
> >  		return -ENOSPC;
> >
> > +	if (skb_headroom(skb) < chip->tx_pkt_desc_sz &&
> > +	    pskb_expand_head(skb, chip->tx_pkt_desc_sz - skb_headroom(skb),
> > 0, GFP_ATOMIC)) {
> > +		dev_err(rtwdev->dev, "no headroom available");
> > +		return -ENOMEM;
> > +	}
> > +
> 
> If it is a headroom issue, you can actually express the needed headroom
> needed by the driver in hw->extra_tx_headroom during init and avoid the
> pskb_expand_head() here.
> 

Looks like a headroom issue, but the driver already assigned headroom.
	max_tx_headroom = rtwdev->chip->tx_pkt_desc_sz;
	hw->extra_tx_headroom = max_tx_headroom;

Then I am not sure why this happens. Nick, can you help to dump_stack()
so we can see where is the skb from?

Yen-Hsuan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] rtw88: fix skb_under_panic in tx path
  2020-06-29  5:20   ` Tony Chuang
@ 2020-11-21  1:43     ` Brian Norris
  0 siblings, 0 replies; 4+ messages in thread
From: Brian Norris @ 2020-11-21  1:43 UTC (permalink / raw)
  To: Yan-Hsuan Chuang, Nick Owens; +Cc: Thomas Pedersen, linux-wireless

(Swapping out Yen-Hsuan's new email)

Necromancing an old thread, since it's still relevant, and I had it
sitting in my inbox to deal with. Now I have something useful to say!

On Mon, Jun 29, 2020 at 11:50 AM Tony Chuang <yhchuang@realtek.com> wrote:
> > On 2020-06-25 13:18, Nick Owens wrote:

> > > fixes the following panic on my thinkpad A485
> > >
> > > Oops#1 Part3
> > > <0>[ 3743.881656] skbuff: skb_under_panic: text:000000005f69fd98
> > > len:208 put:48 head:000000009e2719e8 data:00000000bd3795e0 tail:0xc2
> > > end:0x2c0 dev:wlp2s0
...
> > skb->head and skb->data here are really far (0.5GB) apart. Maybe
> > skb->data actually got corrupted earlier?

For the record, I've reproduced similar issues myself, and the problem
occurs when
(a) the initial SKB starts with minimal headroom (I find that many
SKBs come into mac80211 with plenty of headroom)
(b) the SKB participates in AMSDU aggregation
I've spotted a specific bug, which I'll point out below. But I remain
confused why in many cases the SKB ends up looking so corrupted.

...
> > If it is a headroom issue, you can actually express the needed headroom
> > needed by the driver in hw->extra_tx_headroom during init and avoid the
> > pskb_expand_head() here.
>
> Looks like a headroom issue, but the driver already assigned headroom.
>         max_tx_headroom = rtwdev->chip->tx_pkt_desc_sz;
>         hw->extra_tx_headroom = max_tx_headroom;
>
> Then I am not sure why this happens. Nick, can you help to dump_stack()
> so we can see where is the skb from?

That's not so easy, because of the layers and queueing involved, but
per the above, I've blamed mac80211's AMSDU aggregation. Specifically:

ieee80211_amsdu_aggregate() -> ieee80211_amsdu_prepare_head(), where
we pad out / expand the SKB to fit some additional AMSDU headers (and
later append additional data). But the padding function
(ieee80211_amsdu_realloc_pad()) accounts only for the 802.11 protocol
headroom, and not for the driver-specific headroom. So it chooses not
to expand the headroom, and instead eats into rtw88's space.

For such SKBs, they end up in the driver without sufficient headroom
-- thus, Nick's bug report.

NB: the seemingly-obvious fix (changing the headroom checks in
ieee80211_amsdu_realloc_pad()) does not seem to work, as I hit other
bugs along the way. Unfortunately, I haven't had the time to fix this
all myself properly, nor have I convinced Realtek to fix this
themselves. So in the meantime, Chrome OS is running with this:

https://chromium.googlesource.com/chromiumos/third_party/kernel/+/260a7d4939c323aebe80efc73610682ad2cb187a%5E%21/#F0

It's a similar idea.

We should of course fix the mac80211 bug, but I wonder if we also
deserve some patch similar to either the Chromium patch or Nick's
somewhere (perhaps with a loud warning, etc.), because it's much more
user friendly (in the face of similar future bugs) to do some
suboptimal memcpy()'s, etc., than to crash their systems

Brian

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-11-21  1:43 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25 20:18 [PATCH] rtw88: fix skb_under_panic in tx path Nick Owens
2020-06-26  0:10 ` Thomas Pedersen
2020-06-29  5:20   ` Tony Chuang
2020-11-21  1:43     ` Brian Norris

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.