* [BUG] skb corruption and kernel panic at forwarding with fragmentation
@ 2016-01-06 19:15 Konstantin Khlebnikov
  2016-01-06 19:59 ` Cong Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Konstantin Khlebnikov @ 2016-01-06 19:15 UTC (permalink / raw)
  To: netdev, David Miller, Eric Dumazet, Linux Kernel Mailing List

I've got some of these:

[84408.314676] BUG: unable to handle kernel NULL pointer dereference
at           (null)
[84408.317324] IP: [<ffffffff81166e15>] put_page+0x5/0x50
[84408.319985] PGD 0
[84408.322583] Oops: 0000 [#1] SMP
[84408.325156] Modules linked in: ppp_mppe ppp_async ppp_generic slhc
8021q fuse nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace
sunrpc bridge stp llc xt_HL xt_TCPMSS xt_state w83627ehf hwmon_vid
snd_hda_codec_realtek snd_hda_codec_generic radeon snd_hda_codec_hdmi
snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm
snd_hda_core edac_core k10temp snd_timer snd drm_kms_helper soundcore
ath9k ttm ath9k_common ath9k_hw ath r8169 mii
[84408.336804] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.1.15-zurg #1
[84408.339839] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./RS880D, BIOS 080015  04/12/2011
[84408.342964] task: ffff880216d56f50 ti: ffff880216e04000 task.ti:
ffff880216e04000
[84408.346136] RIP: 0010:[<ffffffff81166e15>]  [<ffffffff81166e15>]
put_page+0x5/0x50
[84408.349301] RSP: 0018:ffff88021fcc37c0  EFLAGS: 00010216
[84408.352433] RAX: 0000000000000030 RBX: 0000000000000001 RCX: 0000000000000077
[84408.355602] RDX: ffff880213d8818e RSI: 0000000000000200 RDI: 0000000000000000
[84408.358765] RBP: ffff88021fcc37e8 R08: 0000000000000076 R09: ffff880216c01900
[84408.361885] R10: ffffea000859a840 R11: 0000000000000001 R12: ffff8802166a1300
[84408.364988] R13: ffff88021280d8c0 R14: ffff8802166a1300 R15: ffff88021280d410
[84408.368059] FS:  00007f9ada2de700(0000) GS:ffff88021fcc0000(0000)
knlGS:0000000000000000
[84408.371211] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[84408.374336] CR2: 0000000000000000 CR3: 0000000216575000 CR4: 00000000000006e0
[84408.377484] Stack:
[84408.380623]  ffffffff81576ac8 ffff88021fcc37e8 ffff8802166a1300
ffff8802166a1300
[84408.383843]  0000000000000000 ffff88021fcc3808 ffffffff81576b48
ffff88021fcc3808
[84408.387022]  0000000000006e00 ffff88021fcc3828 ffffffff81576cdd
0000000000006e00
[84408.390187] Call Trace:
[84408.393293]  <IRQ>
[84408.393323]  [<ffffffff81576ac8>] ? skb_release_data+0x78/0xd0
[84408.399488]  [<ffffffff81576b48>] skb_release_all+0x28/0x30
[84408.402553]  [<ffffffff81576cdd>] consume_skb+0x5d/0x80
[84408.405630]  [<ffffffff815d0d64>] ip_fragment+0x5c4/0x970
[84408.408676]  [<ffffffff815cf740>] ? ip_copy_metadata+0x160/0x160
[84408.411733]  [<ffffffff815d1711>] ip_finish_output+0x601/0x900
[84408.414788]  [<ffffffff815b6ed9>] ? nf_hook_slow+0x99/0x100
[84408.417828]  [<ffffffff815d2366>] ip_output+0x66/0xc0
[84408.420847]  [<ffffffff815d1110>] ? ip_fragment+0x970/0x970
[84408.423864]  [<ffffffff815cd683>] ip_forward_finish+0x73/0xa0
[84408.426864]  [<ffffffff815cda5f>] ip_forward+0x3af/0x490
[84408.429833]  [<ffffffff815cd610>] ? ip_frag_mem+0x50/0x50
[84408.432782]  [<ffffffff815cb701>] ip_rcv_finish+0x81/0x370
[84408.435778]  [<ffffffff815cc0b2>] ip_rcv+0x2a2/0x3c0
[84408.438780]  [<ffffffff815cb680>] ? inet_del_offload+0x40/0x40
[84408.441780]  [<ffffffff8158a623>] __netif_receive_skb_core+0x673/0x810
[84408.444785]  [<ffffffff8158a7d8>] __netif_receive_skb+0x18/0x60
[84408.447766]  [<ffffffff8158a843>] netif_receive_skb_internal+0x23/0x90
[84408.450739]  [<ffffffff8158a8cc>] netif_receive_skb_sk+0x1c/0x70
[84408.453726]  [<ffffffffa04a9e5c>] br_handle_frame_finish+0x27c/0x520 [bridge]
[84408.456774]  [<ffffffff8161dcc8>] ? ipv4_confirm+0xb8/0xe0
[84408.459787]  [<ffffffffa04aa261>] br_handle_frame+0x161/0x290 [bridge]
[84408.462803]  [<ffffffff815cbdb6>] ? ip_local_deliver+0x46/0xa0
[84408.465796]  [<ffffffff8158a2de>] __netif_receive_skb_core+0x32e/0x810
[84408.468822]  [<ffffffff8158a7d8>] __netif_receive_skb+0x18/0x60
[84408.471748]  [<ffffffff8158a843>] netif_receive_skb_internal+0x23/0x90
[84408.474615]  [<ffffffff815f6483>] ? tcp4_gro_complete+0x73/0x80
[84408.477378]  [<ffffffff8158a9bc>] napi_gro_complete+0x9c/0xe0
[84408.480045]  [<ffffffff8158b0a0>] dev_gro_receive+0x230/0x360
[84408.482675]  [<ffffffff8158b400>] napi_gro_receive+0x30/0x100
[84408.485240]  [<ffffffffa000e8d6>] rtl8169_poll+0x2c6/0x6b0 [r8169]
[84408.487766]  [<ffffffff8158ad4a>] net_rx_action+0x1fa/0x320
[84408.490241]  [<ffffffff81090a1b>] __do_softirq+0x10b/0x2d0
[84408.492672]  [<ffffffff81090db5>] irq_exit+0xd5/0xe0
[84408.495072]  [<ffffffff817452d8>] do_IRQ+0x58/0xf0
[84408.497463]  [<ffffffff8174356e>] common_interrupt+0x6e/0x6e
[84408.499879]  <EOI>
[84408.499909]  [<ffffffff8104c726>] ? native_safe_halt+0x6/0x10
[84408.504697]  [<ffffffff810f01be>] ? tick_broadcast_oneshot_control+0xbe/0x200
[84408.507126]  [<ffffffff8100e98e>] default_idle+0x1e/0xc0
[84408.509516]  [<ffffffff8100ea9e>] amd_e400_idle+0x6e/0xf0
[84408.511879]  [<ffffffff8100f51f>] arch_cpu_idle+0xf/0x20
[84408.514181]  [<ffffffff810c4c37>] cpu_startup_entry+0x327/0x3a0
[84408.516456]  [<ffffffff810eea3c>] ? clockevents_register_device+0xec/0x1d0
[84408.518760]  [<ffffffff8103ba08>] start_secondary+0x138/0x160
[84408.521066] Code: 48 89 d7 e8 2e f7 ff ff e9 a1 fe ff ff 48 89 d7
e8 51 f7 ff ff e9 94 fe ff ff 66 90 66 2e 0f 1f 84 00 00 00 00 00 66
66 66 66 90 <48> f7 07 00 c0 00 00 55 48 89 e5 75 1e 8b 47 1c 85 c0 74
27 f0
[84408.526216] RIP  [<ffffffff81166e15>] put_page+0x5/0x50
[84408.528705]  RSP <ffff88021fcc37c0>
[84408.531178] CR2: 0000000000000000

It looks like this happens because ip_options_fragment() relies on a
correct IP options length in the IP control block of the skb. But in
ip_finish_output_gso() the control block of each segment has already been
reused by skb_gso_segment(), so the subsequent ip_fragment() sees garbage.
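
The aliasing described above can be reproduced outside the kernel. This is a minimal userspace sketch, not kernel code: `fake_skb`, `fake_ip_cb`, `fake_gso_cb` and all function names are hypothetical stand-ins, and the layouts are simplified (the real optlen is a single byte and does not sit at offset 0). It only shows that two layers sharing one 48-byte scratch area read back each other's values.

```c
#include <assert.h>
#include <string.h>

#define FAKE_CB_SIZE 48 /* same size as skb->cb; the rest is simplified */

/* Hypothetical stand-in for an skb: only the shared control block. */
struct fake_skb { unsigned char cb[FAKE_CB_SIZE]; };

/* Hypothetical, simplified view used by the IP layer. */
struct fake_ip_cb { int optlen; };

/* Hypothetical, simplified view used by the GSO layer. */
struct fake_gso_cb {
    int mac_offset;
    int encap_level;
    unsigned short csum_start;
};

static_assert(sizeof(struct fake_gso_cb) <= FAKE_CB_SIZE,
              "GSO view must fit in the control block");

/* IP layer: no options present, so the whole block starts out zeroed. */
void ip_layer_init_cb(struct fake_skb *skb)
{
    memset(skb->cb, 0, sizeof(skb->cb));
}

/* GSO layer: stores its own state at the start of the same bytes. */
void gso_layer_write_cb(struct fake_skb *skb, int mac_offset)
{
    struct fake_gso_cb gso;

    memset(&gso, 0, sizeof(gso));
    gso.mac_offset = mac_offset;
    memcpy(skb->cb, &gso, sizeof(gso));
}

/* IP layer again: reads what it believes is still its options length. */
int ip_layer_read_optlen(const struct fake_skb *skb)
{
    struct fake_ip_cb ip;

    memcpy(&ip, skb->cb, sizeof(ip));
    return ip.optlen;
}

/* Full sequence: IP init, GSO write, IP read. */
int optlen_seen_after_gso(int mac_offset)
{
    struct fake_skb skb;

    ip_layer_init_cb(&skb);
    gso_layer_write_cb(&skb, mac_offset);
    return ip_layer_read_optlen(&skb);
}
```

In this model, calling optlen_seen_after_gso(14) returns 14 even though the IP layer stored 0: the "options length" is simply whatever the GSO layer wrote last.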

In my case there were no IP options, but the length became non-zero, so
ip_options_fragment() picked up some bytes from the payload and decided to
fill a huge range with IPOPT_NOOP (1). One of those ones flipped nr_frags
in skb_shared_info at the end of the data =)
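
The overwrite can be modelled in userspace as well. The sketch below is loosely based on the option-walking loop in ip_options_fragment(); the flat buffer, the NR_FRAGS_OFF offset, the payload bytes and the function names are all hypothetical and chosen only to make the effect visible. The model trusts the length it is given, just as the kernel loop trusts the control block.

```c
#include <assert.h>
#include <string.h>

#define IPOPT_END        0
#define IPOPT_NOOP       1
#define IPOPT_COPIED(o)  ((o) & 0x80)

#define BUF_SIZE      80
#define NR_FRAGS_OFF  32 /* hypothetical: byte standing in for nr_frags */

/* Simplified model of the loop: walk l bytes of "options" and replace
 * every option that is not marked as copied with IPOPT_NOOP bytes. */
void model_options_fragment(unsigned char *optptr, int l)
{
    while (l > 0) {
        int optlen;

        switch (*optptr) {
        case IPOPT_END:
            return;
        case IPOPT_NOOP:
            l--;
            optptr++;
            continue;
        }
        optlen = optptr[1];
        if (optlen < 2 || optlen > l)
            return;
        if (!IPOPT_COPIED(*optptr))
            memset(optptr, IPOPT_NOOP, optlen);
        l -= optlen;
        optptr += optlen;
    }
}

/* Build a buffer with no real options, run the model with the given
 * (possibly bogus) length, and report the stand-in nr_frags byte. */
unsigned char nr_frags_after(int bogus_optlen)
{
    unsigned char buf[BUF_SIZE];

    /* The real field is an unsigned char, so keep the model in range. */
    assert(bogus_optlen >= 0 && bogus_optlen <= 255);

    memset(buf, 0, sizeof(buf));
    buf[0] = 0x45; /* payload byte misread as a non-copied option type */
    buf[1] = 40;   /* payload byte misread as that option's length */

    model_options_fragment(buf, bogus_optlen);
    return buf[NR_FRAGS_OFF];
}
```

With a length of 0 the loop never runs and the stand-in byte stays 0. With a bogus length of 78, the two payload bytes are read as a 40-byte option, 40 bytes are set to 1, and the byte at NR_FRAGS_OFF becomes 1.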

Here is a quick hack: just make room for the IP control block in the GSO control block.

--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3316,6 +3316,7 @@ static inline struct sec_path *skb_sec_path(struct sk_buff *skb)
  * Keeps track of level of encapsulation of network headers.
  */
 struct skb_gso_cb {
+ char pad[32]; /* inet_skb_parm lives here */
  int mac_offset;
  int encap_level;
  __u16 csum_start;
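
The effect of the pad can be checked with a small userspace sketch. `fake_gso_cb_padded` is a hypothetical struct that contains only the fields visible in the hunk above, and the 32-byte figure is taken from the patch rather than from the real size of inet_skb_parm. The idea is that the GSO fields now start past the region the IP layer uses, and that everything still fits in the 48-byte cb.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FAKE_CB_SIZE     48 /* sizeof(skb->cb) */
#define FAKE_IP_CB_SIZE  32 /* room reserved by pad[32] in the hack */

/* Hypothetical copy of the patched layout; only fields from the hunk. */
struct fake_gso_cb_padded {
    char pad[FAKE_IP_CB_SIZE]; /* IP control block lives here */
    int mac_offset;
    int encap_level;
    unsigned short csum_start;
};

static_assert(sizeof(struct fake_gso_cb_padded) <= FAKE_CB_SIZE,
              "padded GSO view must still fit in the control block");

/* Fill the cb with a pattern, store one GSO field at its padded offset,
 * and check that the first FAKE_IP_CB_SIZE bytes are untouched.
 * Returns 1 if the IP region survived, 0 otherwise. */
int ip_cb_survives_gso_write(void)
{
    unsigned char cb[FAKE_CB_SIZE];
    unsigned char before[FAKE_IP_CB_SIZE];
    int mac_offset = 14;

    memset(cb, 0xAA, sizeof(cb));
    memcpy(before, cb, sizeof(before));

    memcpy(cb + offsetof(struct fake_gso_cb_padded, mac_offset),
           &mac_offset, sizeof(mac_offset));

    return memcmp(before, cb, sizeof(before)) == 0;
}
```

The static assertion is the part worth keeping in mind: the pad only works while the padded struct stays within the 48 bytes of cb, so any field added later eats into a small remaining margin.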

And a debug patch which prevents the kernel crash too:

--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -215,6 +215,10 @@ void ip_options_fragment(struct sk_buff *skb)
  int  l = opt->optlen;
  int  optlen;

+ const struct iphdr *iph = ip_hdr(skb);
+ l = iph->ihl * 4 - sizeof(struct iphdr);
+ WARN(opt->optlen != l, "%s %d != %d\n", __func__, opt->optlen, l);
+
  while (l > 0) {
  switch (*optptr) {
  case IPOPT_END:
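
The length computed in the debug patch comes straight from the IPv4 header: ihl counts 32-bit words, and the fixed part of the header is 20 bytes. A minimal standalone sketch of that arithmetic follows; the function names and the FAKE_IPHDR_BASE_LEN macro are hypothetical, and the range check assumes the header has already been validated.

```c
#include <assert.h>

#define FAKE_IPHDR_BASE_LEN 20 /* fixed IPv4 header size, without options */

/* Options length in bytes implied by the header length field.
 * ihl is in units of 4 bytes; valid values are 5 (no options) to 15. */
int ip_options_len_from_ihl(unsigned int ihl)
{
    assert(ihl >= 5 && ihl <= 15);
    return (int)ihl * 4 - FAKE_IPHDR_BASE_LEN;
}

/* Mirrors the WARN condition: does the cached length match the header? */
int cb_optlen_matches_header(unsigned int ihl, int cb_optlen)
{
    return cb_optlen == ip_options_len_from_ihl(ihl);
}
```

For a packet without options (ihl of 5) the header says 0 bytes of options, so any non-zero cached length fails the check, which is the mismatch the WARN reports.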

Thread overview: 17+ messages
2016-01-06 19:15 [BUG] skb corruption and kernel panic at forwarding with fragmentation Konstantin Khlebnikov
2016-01-06 19:59 ` Cong Wang
2016-01-06 20:11   ` Konstantin Khlebnikov
2016-01-06 21:05     ` Thadeu Lima de Souza Cascardo
2016-01-06 22:03       ` Florian Westphal
2016-01-06 23:49         ` Florian Westphal
2016-01-07 11:00           ` Konstantin Khlebnikov
2016-01-07 11:38             ` Konstantin Khlebnikov
2016-01-07 11:59               ` Eric Dumazet
2016-01-07 12:04                 ` Konstantin Khlebnikov
2016-01-07 12:54                   ` Eric Dumazet
2016-01-07 19:35                     ` Konstantin Khlebnikov
2016-01-07 19:47                       ` Eric Dumazet
2016-01-07 12:03             ` Florian Westphal
2016-01-07 18:43           ` [PATCH] net: prevent corruption of skb when using skb_gso_segment Thadeu Lima de Souza Cascardo
2016-01-07 19:31             ` Florian Westphal
2016-01-07 21:16               ` [PATCH v2] " Thadeu Lima de Souza Cascardo
