netfilter-devel.vger.kernel.org archive mirror
* System crash on Ubuntu 18, in netlink code when using iptables / netfilter
@ 2020-11-30 19:38 Yuri Lipnesh
  2020-11-30 19:58 ` Florian Westphal
  0 siblings, 1 reply; 5+ messages in thread
From: Yuri Lipnesh @ 2020-11-30 19:38 UTC
  To: netfilter-devel

Linux system crashed

[    0.000000] Linux version 5.4.0-54-generic (buildd@lcy01-amd64-008) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #60~18.04.1-Ubuntu SMP Fri Nov 6 17:25:16 UTC 2020 (Ubuntu 5.4.0-54.60~18.04.1-generic 5.4.65)
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-54-generic root=UUID=11885fd3-b840-4c9b-a500-532c73ac952a ro find_preseed=/preseed.cfg auto noprompt priority=critical locale=en_US quiet crashkernel=512M-:192M

…
[  156.321147] TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised.
[  177.519159] general protection fault: 0000 [#1] SMP PTI
[  177.519737] CPU: 5 PID: 18484 Comm: worker-1 Kdump: loaded Not tainted 5.4.0-54-generic #60~18.04.1-Ubuntu
[  177.519742] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
[  177.519814] RIP: 0010:dev_hard_start_xmit+0x38/0x200
[  177.519827] Code: 55 41 54 53 48 83 ec 20 48 85 ff 48 89 55 c8 48 89 4d b8 0f 84 c1 01 00 00 48 8d 86 90 00 00 00 48 89 fb 49 89 f4 48 89 45 c0 <4c> 8b 2b 48 c7 c0 d0 f2 04 8f 48 c7 03 00 00 00 00 48 8b 00 4d 85
[  177.519829] RSP: 0018:ffffbc6d0609b5e8 EFLAGS: 00010286
[  177.519833] RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff95cf4bcfe800
[  177.519835] RDX: 0000000000000000 RSI: ffff95cf4bcfe800 RDI: 0000000000000286
[  177.519837] RBP: ffffbc6d0609b630 R08: ffff95cf6a190ec8 R09: ffff95cf4a2f7438
[  177.519839] R10: ffffbc6d0609b6d0 R11: ffff95cf49d4d180 R12: ffff95cf51a5f000
[  177.519841] R13: dead000000000100 R14: 000000000000009c R15: ffff95d02996b400
[  177.519844] FS:  00007ff394cdfb20(0000) GS:ffff95d035d40000(0000) knlGS:0000000000000000
[  177.519846] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  177.519848] CR2: 00007fb4a9c2d000 CR3: 00000001049fa004 CR4: 00000000003606e0
[  177.519908] Call Trace:
[  177.519917]  __dev_queue_xmit+0x719/0x920
[  177.519930]  ? ctnetlink_conntrack_event+0x8c/0x5e0 [nf_conntrack_netlink]
[  177.519934]  dev_queue_xmit+0x10/0x20
[  177.519937]  ? dev_queue_xmit+0x10/0x20
[  177.519940]  ip_finish_output2+0x304/0x5a0
[  177.519944]  ? conntrack_mt_v3+0x20/0x30 [xt_conntrack]
[  177.519947]  __ip_finish_output+0xfa/0x1c0
[  177.519949]  ? __ip_finish_output+0xfa/0x1c0
[  177.519952]  ip_finish_output+0x2c/0xa0
[  177.519954]  ip_output+0x6d/0xe0
[  177.519957]  ? __ip_finish_output+0x1c0/0x1c0
[  177.519960]  ip_forward_finish+0x57/0x90
[  177.519963]  ip_forward+0x38c/0x480
[  177.519967]  ? ip4_key_hashfn+0xc0/0xc0
[  177.519970]  ip_rcv_finish+0x84/0xa0
[  177.519973]  nf_reinject+0x18e/0x1e0
[  177.519980]  nfqnl_reinject+0x50/0x60 [nfnetlink_queue]
[  177.519984]  nfqnl_recv_verdict+0x310/0x4c0 [nfnetlink_queue]
[  177.519990]  nfnetlink_rcv_msg+0x165/0x290 [nfnetlink]
[  177.520000]  ? __switch_to_asm+0x34/0x70
[  177.520002]  ? __switch_to_asm+0x40/0x70
[  177.520005]  ? __switch_to_asm+0x34/0x70
[  177.520008]  ? apic_timer_interrupt+0xa/0x20
[  177.520013]  ? nfnetlink_net_exit_batch+0x70/0x70 [nfnetlink]
[  177.520016]  netlink_rcv_skb+0x51/0x120
[  177.520021]  nfnetlink_rcv+0x88/0x145 [nfnetlink]
[  177.520024]  netlink_unicast+0x1a4/0x250
[  177.520027]  netlink_sendmsg+0x2eb/0x3f0
[  177.520032]  sock_sendmsg+0x63/0x70
[  177.520036]  ____sys_sendmsg+0x200/0x280
[  177.520041]  ___sys_sendmsg+0x88/0xd0
[  177.520047]  ? __wake_up+0x13/0x20
[  177.520052]  ? fput+0x13/0x20
[  177.520055]  ? __sys_recvfrom+0x14b/0x160
[  177.520058]  ? sock_poll+0x79/0xb0
[  177.520061]  __sys_sendmsg+0x63/0xa0
[  177.520063]  ? __sys_sendmsg+0x63/0xa0
[  177.520067]  __x64_sys_sendmsg+0x1f/0x30
[  177.520072]  do_syscall_64+0x57/0x190
[  177.520075]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  177.520079] RIP: 0033:0x7ff39660c879
[  177.520083] Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 4d d3 ff ff 41 54 b8 02 00 00 00 49 89 f4 be 00 08 08 00 55
[  177.520085] RSP: 002b:00007ff394cddaa8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  177.520089] RAX: ffffffffffffffda RBX: 00007ff394cdfb20 RCX: 00007ff39660c879
[  177.520091] RDX: 0000000000000000 RSI: 00007ff394cddb08 RDI: 0000000000000022
[  177.520092] RBP: 0000000000000022 R08: 0000000000000000 R09: 0000000000000000
[  177.520094] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002e
[  177.520095] R13: 0000000000000022 R14: 00007ff394cddb08 R15: 00000000fa000000
[  177.520098] Modules linked in: nfnetlink_queue xt_NFQUEUE ipt_rpfilter xt_multiport xt_set iptable_raw ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel vxlan ip6_udp_tunnel udp_tunnel ipt_REJECT nf_reject_ipv4 ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_mangle xt_comment xt_mark rfcomm veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc intel_rapl_msr intel_rapl_common crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper rapl bnep vmw_balloon aufs snd_ens1371 snd_ac97_codec gameport ac97_bus input_leds snd_pcm joydev serio_raw snd_seq_midi snd_seq_midi_event snd_rawmidi btusb btrtl btbcm snd_seq btintel bluetooth snd_seq_device snd_timer ecdh_generic ecc snd soundcore overlay mac_hid vmw_vsock_vmci_transport vsock vmw_vmci sch_fq_codel vmwgfx ttm
[  177.520148]  drm_kms_helper drm fb_sys_fops syscopyarea sysfillrect sysimgblt parport_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid mptspi mptscsih mptbase ahci psmouse e1000 libahci scsi_transport_spi i2c_piix4 pata_acpi


Two products, Calico and Aqua Security, use iptables/netfilter on that system.

Regards,
Yuri Lipnesh




* Re: System crash on Ubuntu 18, in netlink code when using iptables / netfilter
  2020-11-30 19:38 System crash on Ubuntu 18, in netlink code when using iptables / netfilter Yuri Lipnesh
@ 2020-11-30 19:58 ` Florian Westphal
  2020-12-03 17:00   ` Yuri Lipnesh
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2020-11-30 19:58 UTC
  To: Yuri Lipnesh; +Cc: netfilter-devel, stable

Yuri Lipnesh <yuri.lipnesh@gmail.com> wrote:
> Linux system crashed
> 
> [    0.000000] Linux version 5.4.0-54-generic (buildd@lcy01-amd64-008) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #60~18.04.1-Ubuntu SMP Fri Nov 6 17:25:16 UTC 2020 (Ubuntu 5.4.0-54.60~18.04.1-generic 5.4.65)
> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-54-generic root=UUID=11885fd3-b840-4c9b-a500-532c73ac952a ro find_preseed=/preseed.cfg auto noprompt priority=critical locale=en_US quiet crashkernel=512M-:192M
> 
> …
> [  156.321147] TCP: eth0: Driver has suspect GRO implementation, TCP performance may be compromised.
> [  177.519159] general protection fault: 0000 [#1] SMP PTI
> [  177.519737] CPU: 5 PID: 18484 Comm: worker-1 Kdump: loaded Not tainted 5.4.0-54-generic #60~18.04.1-Ubuntu
> [  177.519742] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020
> [  177.519814] RIP: 0010:dev_hard_start_xmit+0x38/0x200
> [  177.519827] Code: 55 41 54 53 48 83 ec 20 48 85 ff 48 89 55 c8 48 89 4d b8 0f 84 c1 01 00 00 48 8d 86 90 00 00 00 48 89 fb 49 89 f4 48 89 45 c0 <4c> 8b 2b 48 c7 c0 d0 f2 04 8f 48 c7 03 00 00 00 00 48 8b 00 4d 85
> [  177.519829] RSP: 0018:ffffbc6d0609b5e8 EFLAGS: 00010286
> [  177.519833] RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff95cf4bcfe800
> [  177.519835] RDX: 0000000000000000 RSI: ffff95cf4bcfe800 RDI: 0000000000000286
> [  177.519837] RBP: ffffbc6d0609b630 R08: ffff95cf6a190ec8 R09: ffff95cf4a2f7438
> [  177.519839] R10: ffffbc6d0609b6d0 R11: ffff95cf49d4d180 R12: ffff95cf51a5f000
> [  177.519841] R13: dead000000000100 R14: 000000000000009c R15: ffff95d02996b400
> [  177.519844] FS:  00007ff394cdfb20(0000) GS:ffff95d035d40000(0000) knlGS:0000000000000000
> [  177.519846] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  177.519848] CR2: 00007fb4a9c2d000 CR3: 00000001049fa004 CR4: 00000000003606e0
> [  177.519908] Call Trace:
> [  177.519917]  __dev_queue_xmit+0x719/0x920
> [  177.519930]  ? ctnetlink_conntrack_event+0x8c/0x5e0 [nf_conntrack_netlink]

Can you reproduce this on 5.7 or later, or with the following patches
backported to 5.4.y?

 dd3cc111f2e3220ddc9c4ab17f13dc97759b5163
 119e52e664c57d5f7c0174dc2b3a296b1e40591d
 af370ab36fcd19f04e3408c402608e7e56e6f188
 28f715b9e6dd7cbf07c2aea913fea7c87a56a3b5

The series fixed nfqueue reference counting.
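
A note for readers decoding the oops quoted above: RBX and R13 both hold
dead000000000100. The sketch below is only an illustration and is not taken
from the thread. It assumes the default x86_64 poison base of
0xdead000000000000 and 48-bit (4-level) virtual addressing, and the helper
name is_canonical is made up for this example. Under those assumptions the
value matches LIST_POISON1, the marker written into list pointers on removal,
and it is a non-canonical address, which would explain why the report shows a
general protection fault rather than a page fault.

```python
# Assumption: default x86_64 poison base (CONFIG_ILLEGAL_POINTER_VALUE).
POISON_POINTER_DELTA = 0xDEAD000000000000
# LIST_POISON1 is the poison base plus 0x100.
LIST_POISON1 = POISON_POINTER_DELTA + 0x100


def is_canonical(addr: int) -> bool:
    """Return True if addr is canonical under 48-bit virtual addressing.

    Bits 63..47 must be all zeros (user half) or all ones (kernel half).
    """
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


rbx = 0xDEAD000000000100  # RBX from the first oops in this thread
rsi = 0xFFFF95CF4BCFE800  # RSI from the same oops, an ordinary kernel pointer

print(hex(rbx), "is LIST_POISON1:", rbx == LIST_POISON1)
print(hex(rbx), "canonical:", is_canonical(rbx))
print(hex(rsi), "canonical:", is_canonical(rsi))
```

A pointer carrying this poison value usually means the object was already
unlinked or freed when it was used, which fits a reference counting problem
but does not prove one.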


* Re: System crash on Ubuntu 18, in netlink code when using iptables / netfilter
  2020-11-30 19:58 ` Florian Westphal
@ 2020-12-03 17:00   ` Yuri Lipnesh
  2021-08-30 15:43     ` System crash in netfilter 5.10.25 Yuri Lipnesh
  0 siblings, 1 reply; 5+ messages in thread
From: Yuri Lipnesh @ 2020-12-03 17:00 UTC
  To: Florian Westphal; +Cc: netfilter-devel, stable

It seems that upgrading to Linux 5.7 solved the problem; we will run more tests.
Thank you,
Yuri 


* Re: System crash in netfilter  5.10.25
  2020-12-03 17:00   ` Yuri Lipnesh
@ 2021-08-30 15:43     ` Yuri Lipnesh
  2021-08-30 20:51       ` Florian Westphal
  0 siblings, 1 reply; 5+ messages in thread
From: Yuri Lipnesh @ 2021-08-30 15:43 UTC
  To: Florian Westphal; +Cc: netfilter-devel, stable

Hello Florian,

I need assistance with this one. Our customer's system, running 5.10.25-flatcar, crashed with the following trace:

Aug 26 10:26:32.686733 amc-k8sdevsl01-worker-lx13 kernel: ------------[ cut here ]------------
Aug 26 10:26:32.686855 amc-k8sdevsl01-worker-lx13 kernel: refcount_t: underflow; use-after-free.
Aug 26 10:26:32.686877 amc-k8sdevsl01-worker-lx13 kernel: WARNING: CPU: 4 PID: 2422635 at lib/refcount.c:28 refcount_warn_saturat>
Aug 26 10:26:32.686930 amc-k8sdevsl01-worker-lx13 kernel: Modules linked in: binfmt_misc nfnetlink_queue xt_NFQUEUE xt_multiport >
Aug 26 10:26:32.689906 amc-k8sdevsl01-worker-lx13 kernel:  dm_region_hash dm_log dm_mod
Aug 26 10:26:32.690398 amc-k8sdevsl01-worker-lx13 kernel: CPU: 4 PID: 2422635 Comm: worker-1 Not tainted 5.10.25-flatcar #1
Aug 26 10:26:32.690526 amc-k8sdevsl01-worker-lx13 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Refer>
Aug 26 10:26:32.691653 amc-k8sdevsl01-worker-lx13 kernel: RIP: 0010:refcount_warn_saturate+0xa6/0xf0
Aug 26 10:26:32.691720 amc-k8sdevsl01-worker-lx13 kernel: Code: 05 3c 1d 40 01 01 e8 81 46 38 00 0f 0b c3 80 3d 2a 1d 40 01 00 75>
Aug 26 10:26:32.691747 amc-k8sdevsl01-worker-lx13 kernel: RSP: 0018:ffffa3a0c3627938 EFLAGS: 00010282
Aug 26 10:26:32.692385 amc-k8sdevsl01-worker-lx13 kernel: RAX: 0000000000000000 RBX: ffff8c011b14fa00 RCX: 0000000000000027
Aug 26 10:26:32.692422 amc-k8sdevsl01-worker-lx13 kernel: RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff8c045d918b08
Aug 26 10:26:32.692446 amc-k8sdevsl01-worker-lx13 kernel: RBP: ffff8c011b14fa00 R08: ffff8c045d918b00 R09: ffffa3a0c3627750
Aug 26 10:26:32.693526 amc-k8sdevsl01-worker-lx13 kernel: R10: 0000000000000001 R11: 0000000000000001 R12: ffff8c011b14fa30
Aug 26 10:26:32.693584 amc-k8sdevsl01-worker-lx13 kernel: R13: 0000000000000002 R14: ffff8bfda3b43180 R15: ffff8c00cddb3a00
Aug 26 10:26:32.693615 amc-k8sdevsl01-worker-lx13 kernel: FS:  00007ff7a2331b38(0000) GS:ffff8c045d900000(0000) knlGS:00000000000>
Aug 26 10:26:32.693649 amc-k8sdevsl01-worker-lx13 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 26 10:26:32.694304 amc-k8sdevsl01-worker-lx13 kernel: CR2: 00007ff79ac17a28 CR3: 00000001ee34e003 CR4: 00000000007706e0
Aug 26 10:26:32.694334 amc-k8sdevsl01-worker-lx13 kernel: PKRU: 55555554
Aug 26 10:26:32.694351 amc-k8sdevsl01-worker-lx13 kernel: Call Trace:
Aug 26 10:26:32.694370 amc-k8sdevsl01-worker-lx13 kernel:  nf_queue_entry_release_refs+0x82/0xa0
Aug 26 10:26:32.695381 amc-k8sdevsl01-worker-lx13 kernel:  nf_reinject+0x6f/0x1a0
Aug 26 10:26:32.695404 amc-k8sdevsl01-worker-lx13 kernel:  0xffffffffc0857980
Aug 26 10:26:32.695425 amc-k8sdevsl01-worker-lx13 kernel:  nfnetlink_unicast+0x1f1/0x420 [nfnetlink]
Aug 26 10:26:32.695441 amc-k8sdevsl01-worker-lx13 kernel:  ? cred_has_capability+0x7f/0x120
Aug 26 10:26:32.695457 amc-k8sdevsl01-worker-lx13 kernel:  ? nfnetlink_unicast+0xa0/0x420 [nfnetlink]
Aug 26 10:26:32.695475 amc-k8sdevsl01-worker-lx13 kernel:  netlink_rcv_skb+0x50/0x100
Aug 26 10:26:32.696440 amc-k8sdevsl01-worker-lx13 kernel:  nfnetlink_subsys_register+0x789/0x869 [nfnetlink]
Aug 26 10:26:32.696465 amc-k8sdevsl01-worker-lx13 kernel:  netlink_unicast+0x191/0x230
Aug 26 10:26:32.696492 amc-k8sdevsl01-worker-lx13 kernel:  netlink_sendmsg+0x243/0x480
Aug 26 10:26:32.696513 amc-k8sdevsl01-worker-lx13 kernel:  sock_sendmsg+0x5e/0x60
Aug 26 10:26:32.696529 amc-k8sdevsl01-worker-lx13 kernel:  ____sys_sendmsg+0x1f3/0x260
Aug 26 10:26:32.697288 amc-k8sdevsl01-worker-lx13 kernel:  ? copy_msghdr_from_user+0x5c/0x90
Aug 26 10:26:32.697309 amc-k8sdevsl01-worker-lx13 kernel:  ? _cond_resched+0x15/0x30
Aug 26 10:26:32.697329 amc-k8sdevsl01-worker-lx13 kernel:  ___sys_sendmsg+0x81/0xc0
Aug 26 10:26:32.697348 amc-k8sdevsl01-worker-lx13 kernel:  ? do_lock_file_wait+0x6e/0xe0
Aug 26 10:26:32.697370 amc-k8sdevsl01-worker-lx13 kernel:  ? _cond_resched+0x15/0x30
Aug 26 10:26:32.698946 amc-k8sdevsl01-worker-lx13 kernel:  ? fcntl_setlk+0x1a5/0x2d0
Aug 26 10:26:32.698988 amc-k8sdevsl01-worker-lx13 kernel:  __sys_sendmsg+0x59/0xa0
Aug 26 10:26:32.699005 amc-k8sdevsl01-worker-lx13 kernel:  do_syscall_64+0x33/0x40
Aug 26 10:26:32.699020 amc-k8sdevsl01-worker-lx13 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Aug 26 10:26:32.699039 amc-k8sdevsl01-worker-lx13 kernel: RIP: 0033:0x7ff7ab1283ad
Aug 26 10:26:32.699071 amc-k8sdevsl01-worker-lx13 kernel: Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2>
Aug 26 10:26:32.699090 amc-k8sdevsl01-worker-lx13 kernel: RSP: 002b:00007ff7a232f9f8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
Aug 26 10:26:32.699505 amc-k8sdevsl01-worker-lx13 kernel: RAX: ffffffffffffffda RBX: 00007ff7a2331b38 RCX: 00007ff7ab1283ad
Aug 26 10:26:32.699534 amc-k8sdevsl01-worker-lx13 kernel: RDX: 0000000000000000 RSI: 00007ff7a232fa48 RDI: 0000000000000078
Aug 26 10:26:09.088408 amc-k8sdevsl01-worker-lx13 kernel: SELinux:  Class xdp_socket not defined in policy.

Is there a fix available for that crash?

Thank you,
Yuri



* Re: System crash in netfilter  5.10.25
  2021-08-30 15:43     ` System crash in netfilter 5.10.25 Yuri Lipnesh
@ 2021-08-30 20:51       ` Florian Westphal
  0 siblings, 0 replies; 5+ messages in thread
From: Florian Westphal @ 2021-08-30 20:51 UTC
  To: Yuri Lipnesh; +Cc: Florian Westphal, netfilter-devel, stable

Yuri Lipnesh <yuri.lipnesh@gmail.com> wrote:
> Hello Florian,
> 
> I need assistance on this one. Our customer system 5.10.25-flatcar crashed with following trace
> 
> Aug 26 10:26:32.686733 amc-k8sdevsl01-worker-lx13 kernel: ------------[ cut here ]------------
> Aug 26 10:26:32.686855 amc-k8sdevsl01-worker-lx13 kernel: refcount_t: underflow; use-after-free.
> Aug 26 10:26:32.686877 amc-k8sdevsl01-worker-lx13 kernel: WARNING: CPU: 4 PID: 2422635 at lib/refcount.c:28 refcount_warn_saturat>
> Aug 26 10:26:32.686930 amc-k8sdevsl01-worker-lx13 kernel: Modules linked in: binfmt_misc nfnetlink_queue xt_NFQUEUE xt_multiport >
> Aug 26 10:26:32.689906 amc-k8sdevsl01-worker-lx13 kernel:  dm_region_hash dm_log dm_mod
> Aug 26 10:26:32.690398 amc-k8sdevsl01-worker-lx13 kernel: CPU: 4 PID: 2422635 Comm: worker-1 Not tainted 5.10.25-flatcar #1
> Aug 26 10:26:32.690526 amc-k8sdevsl01-worker-lx13 kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Refer>
> Aug 26 10:26:32.691653 amc-k8sdevsl01-worker-lx13 kernel: RIP: 0010:refcount_warn_saturate+0xa6/0xf0
> Aug 26 10:26:32.691720 amc-k8sdevsl01-worker-lx13 kernel: Code: 05 3c 1d 40 01 01 e8 81 46 38 00 0f 0b c3 80 3d 2a 1d 40 01 00 75>
> Aug 26 10:26:32.691747 amc-k8sdevsl01-worker-lx13 kernel: RSP: 0018:ffffa3a0c3627938 EFLAGS: 00010282
> Aug 26 10:26:32.692385 amc-k8sdevsl01-worker-lx13 kernel: RAX: 0000000000000000 RBX: ffff8c011b14fa00 RCX: 0000000000000027
> Aug 26 10:26:32.692422 amc-k8sdevsl01-worker-lx13 kernel: RDX: 0000000000000027 RSI: 00000000ffffdfff RDI: ffff8c045d918b08
> Aug 26 10:26:32.692446 amc-k8sdevsl01-worker-lx13 kernel: RBP: ffff8c011b14fa00 R08: ffff8c045d918b00 R09: ffffa3a0c3627750
> Aug 26 10:26:32.693526 amc-k8sdevsl01-worker-lx13 kernel: R10: 0000000000000001 R11: 0000000000000001 R12: ffff8c011b14fa30
> Aug 26 10:26:32.693584 amc-k8sdevsl01-worker-lx13 kernel: R13: 0000000000000002 R14: ffff8bfda3b43180 R15: ffff8c00cddb3a00
> Aug 26 10:26:32.693615 amc-k8sdevsl01-worker-lx13 kernel: FS:  00007ff7a2331b38(0000) GS:ffff8c045d900000(0000) knlGS:00000000000>
> Aug 26 10:26:32.693649 amc-k8sdevsl01-worker-lx13 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Aug 26 10:26:32.694304 amc-k8sdevsl01-worker-lx13 kernel: CR2: 00007ff79ac17a28 CR3: 00000001ee34e003 CR4: 00000000007706e0
> Aug 26 10:26:32.694334 amc-k8sdevsl01-worker-lx13 kernel: PKRU: 55555554
> Aug 26 10:26:32.694351 amc-k8sdevsl01-worker-lx13 kernel: Call Trace:
> Aug 26 10:26:32.694370 amc-k8sdevsl01-worker-lx13 kernel:  nf_queue_entry_release_refs+0x82/0xa0

Is that sock_put()?

If so, I don't understand this backtrace.  When refcount_t debugging is
on, sock_hold() would also generate a backtrace if we tried to
increase the refcount on a socket that already has a zero refcount.

So it looks like something else decremented the sk refcount while the
packet was queued.  No idea how that could happen.
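
The reasoning above can be illustrated with a toy model. This is a minimal
sketch in Python, not kernel code: the RefCount class and the run() helper are
hypothetical names, the model ignores atomics and saturation, and the sequence
of holds and puts is an assumed scenario rather than the actual path taken on
the crashed system. It only shows why a warning on the final put, with no
earlier warning on the hold, points at an unbalanced put in between.

```python
class RefCount:
    """Toy model of a checked reference counter (no atomics, no saturation)."""

    def __init__(self, value=1):
        self.value = value
        self.warnings = []

    def inc(self):
        # Taking a reference on a zero count is reported, as with sock_hold()
        # when refcount checking is enabled.
        if self.value == 0:
            self.warnings.append("addition on 0; use-after-free")
        self.value += 1

    def dec_and_test(self):
        # Dropping a reference on a zero count is reported as an underflow.
        if self.value <= 0:
            self.warnings.append("underflow; use-after-free")
            return False
        self.value -= 1
        return self.value == 0


def run(extra_put):
    sk = RefCount(1)       # reference held by the socket's owner
    sk.inc()               # hold taken when the packet is queued
    sk.dec_and_test()      # owner drops its reference
    if extra_put:
        sk.dec_and_test()  # hypothetical unbalanced put from somewhere else
    sk.dec_and_test()      # put when the verdict releases the queue entry
    return sk.warnings


print(run(extra_put=False))
print(run(extra_put=True))
```

In the balanced run nothing is reported. With one extra put, the hold at queue
time still succeeds silently and only the release at verdict time reports an
underflow, which matches the shape of the trace quoted above.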

