possible stack corruption in icmp_send (__stack_chk_fail)

From: Jason A. Donenfeld @ 2021-02-17 18:12 UTC
  To: Netdev, Willem de Bruijn; +Cc: LKML

Hi Netdev & Willem,

I've received a report of stack corruption -- via the stack protector
check -- in icmp_send. I was sent a vmcore, and was able to extract
the OOPS from there. However, I've been unable to reproduce the bug and
I don't see where it'd be in the code. That might point to a more
sinister problem, or I'm simply not seeing it. Apparently the
reporter reproduces it every 40 or so minutes, and has seen it happen
since at least ~5.10. Willem - I'm emailing you because it seems like
you were making a lot of changes to the icmp code around then, and
perhaps you have an intuition. For example, some of the error handling
code takes a pointer to a stack buffer (_objh and such), and maybe
that's problematic? I'm not quite sure. The vmcore, along with the
various kernel binaries I hunted down, is here:
https://data.zx2c4.com/icmp_send-crash-e03b4a42-706a-43bf-bc40-1f15966b3216.tar.xz
The extracted dmesg follows below, in case you or anyone has a
pointer. I've been staring at this for a while and don't see it.

Jason
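
For readers unfamiliar with the "pointer to a stack buffer" pattern
mentioned above, here is a minimal user-space sketch of it. It is not
the kernel's code: struct pkt, struct obj_hdr, hdr_pointer() and
read_obj_hdr() are hypothetical names made up for illustration. The
helper is written in the spirit of skb_header_pointer(), which copies
into a caller-supplied buffer; unlike the kernel helper, this sketch
also takes the buffer size, an assumption added here so the copy can
be bounds-checked.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a packet: a flat byte buffer and its length. */
struct pkt {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical 4-byte object header, used only for this illustration. */
struct obj_hdr {
	unsigned short length;
	unsigned char class_num;
	unsigned char class_type;
};

/*
 * Copy len bytes starting at offset into the caller's buffer and return
 * a pointer to it, or NULL if the request does not fit in either the
 * packet or the buffer. The buf_size check is an addition of this sketch.
 */
static const void *hdr_pointer(const struct pkt *p, size_t offset,
			       size_t len, void *buf, size_t buf_size)
{
	if (len > buf_size)
		return NULL;	/* would write past the caller's stack buffer */
	if (offset > p->len || len > p->len - offset)
		return NULL;	/* would read past the end of the packet */
	memcpy(buf, p->data + offset, len);
	return buf;
}

/* Read one header through an on-stack scratch buffer named _objh. */
static int read_obj_hdr(const struct pkt *p, size_t offset,
			struct obj_hdr *out)
{
	struct obj_hdr _objh;	/* lives on this function's stack frame */
	const struct obj_hdr *objh;

	objh = hdr_pointer(p, offset, sizeof(_objh), &_objh, sizeof(_objh));
	if (!objh)
		return -1;
	*out = *objh;		/* copy out before _objh goes out of scope */
	return 0;
}
```

In a pattern like this, the stack frame is only safe as long as the
length passed to the helper never exceeds the size of the on-stack
buffer. Whether anything of the sort is involved in this crash is not
established here.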

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __icmp_send+0x5bd/0x5c0
CPU: 0 PID: 959 Comm: kworker/0:2 Kdump: loaded Not tainted 5.11.0-051100-lowlatency #202102142330
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker [wireguard]
Call Trace:
 <IRQ>
 show_stack+0x52/0x58
 dump_stack+0x70/0x8b
 panic+0x108/0x2ea
 ? ip_push_pending_frames+0x42/0x90
 ? __icmp_send+0x5bd/0x5c0
 __stack_chk_fail+0x14/0x20
 __icmp_send+0x5bd/0x5c0
 icmp_ndo_send+0x148/0x160
 wg_xmit+0x359/0x450 [wireguard]
 ? harmonize_features+0x19/0x80
 xmit_one.constprop.0+0x9f/0x190
 dev_hard_start_xmit+0x43/0x90
 sch_direct_xmit+0x11d/0x340
 __qdisc_run+0x66/0xc0
 __dev_xmit_skb+0xd5/0x340
 __dev_queue_xmit+0x32b/0x4d0
 ? nf_conntrack_double_lock.constprop.0+0x97/0x140 [nf_conntrack]
 dev_queue_xmit+0x10/0x20
 neigh_connected_output+0xcb/0xf0
 ip_finish_output2+0x17f/0x470
 __ip_finish_output+0x9b/0x140
 ? ipv4_confirm+0x4a/0x80 [nf_conntrack]
 ip_finish_output+0x2d/0xb0
 ip_output+0x78/0x110
 ? __ip_finish_output+0x140/0x140
 ip_forward_finish+0x58/0x90
 ip_forward+0x40a/0x4d0
 ? ip4_key_hashfn+0xb0/0xb0
 ip_sublist_rcv_finish+0x3d/0x50
 ip_list_rcv_finish.constprop.0+0x163/0x190
 ip_sublist_rcv+0x37/0xb0
 ? ip_rcv_finish_core.constprop.0+0x310/0x310
 ip_list_rcv+0xf5/0x120
 __netif_receive_skb_list_core+0x228/0x250
 __netif_receive_skb_list+0x102/0x170
 ? dev_gro_receive+0x1b5/0x370
 netif_receive_skb_list_internal+0xca/0x190
 napi_complete_done+0x7a/0x1a0
 wg_packet_rx_poll+0x384/0x400 [wireguard]
 napi_poll+0x92/0x200
 net_rx_action+0xb8/0x1c0
 __do_softirq+0xce/0x2b3
 asm_call_irq_on_stack+0x12/0x20
 </IRQ>
 do_softirq_own_stack+0x3d/0x50
 do_softirq+0x66/0x80
 __local_bh_enable_ip+0x62/0x70
 _raw_spin_unlock_bh+0x1e/0x20
 wg_packet_decrypt_worker+0xf6/0x190 [wireguard]
 process_one_work+0x217/0x3e0
 worker_thread+0x4d/0x350
 ? rescuer_thread+0x390/0x390
 kthread+0x145/0x170
 ? __kthread_bind_mask+0x70/0x70
 ret_from_fork+0x22/0x30
Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
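
Each trace entry above has the form symbol+offset/size, so
__icmp_send+0x5bd/0x5c0 is a return address 0x5bd bytes into a function
0x5c0 bytes long, i.e. 3 bytes before its end. That is consistent with
the stack-protector check sitting in the function's epilogue, where
compilers typically place the call to __stack_chk_fail. The helper
below is a small sketch for pulling those fields apart; struct
trace_loc and parse_trace_loc() are hypothetical names, not part of any
kernel tool.

```c
#include <stdio.h>

/* One "symbol+0xoff/0xsize" location taken from a call trace line. */
struct trace_loc {
	char sym[64];
	unsigned long off;	/* offset of the address within the symbol */
	unsigned long size;	/* total size of the symbol in bytes */
};

/*
 * Parse a location such as "__icmp_send+0x5bd/0x5c0". Leading whitespace
 * is skipped. Returns 0 on success, -1 if the text does not match or the
 * offset lies beyond the symbol's size.
 */
static int parse_trace_loc(const char *s, struct trace_loc *out)
{
	if (sscanf(s, " %63[^+]+%lx/%lx", out->sym, &out->off, &out->size) != 3)
		return -1;
	return out->off <= out->size ? 0 : -1;
}
```

For the entry above, parse_trace_loc() yields off = 0x5bd and
size = 0x5c0, so size - off is 3.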
