netdev.vger.kernel.org archive mirror
* Re: PROBLEM: virtio_net LRO kernel panics
       [not found] <CACFia2dwacaVVYD+1uG=CDGaJqdCOSBvZ5FcXp04caecaWAY3w@mail.gmail.com>
@ 2021-07-30 11:42 ` Michael S. Tsirkin
  2021-07-30 17:04   ` Ivan
  0 siblings, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2021-07-30 11:42 UTC
  To: Ivan
  Cc: Jason Wang, Willem de Bruijn, David S. Miller, Tonghao Zhang,
	virtualization, netdev, Eric Dumazet, Jakub Kicinski

On Thu, Jul 22, 2021 at 06:27:18PM -0500, Ivan wrote:
> Dear Sir,
> 
> I've been plagued with kernel panics recently. The problem is easily
> reproducible on any virtual machine that uses the virtio-net driver
> from the stock Linux kernel. Simply issue this command:
> 
> echo 1 > /proc/sys/net/ipv4/ip_forward
> ...and the kernel panics.
> 
> Is there any way we can possibly fix this?
> 
> kernel: ------------[ cut here ]------------
> kernel: netdevice: eth0: failed to disable LRO!
> kernel: WARNING: CPU: 1 PID: 424 at net/core/dev.c:1768
> dev_disable_lro+0x108/0x150
> kernel: Modules linked in: nls_iso8859_1 nls_cp437 vfat fat usbhid
> atkbd libps2 ahci libahci virtio_net ohci_pci net_failover failover
> i8042 serio lpc_ich mfd_core libata ohci_hcd ehci_pci ehci_hcd usbcore
> rng_core i2c_piix4 i2c_core virtio_pci usb_common
> virtio_pci_modern_dev virtio_ring virtio loop unix
> kernel: CPU: 1 PID: 424 Comm: bash Not tainted 5.13.4-gnu.4-NuMini #1
> kernel: Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
> VirtualBox 12/01/2006
> kernel: RIP: 0010:dev_disable_lro+0x108/0x150
> kernel: Code: ae 88 74 14 be 25 00 00 00 48 89 df e8 f1 54 ed ff 48 85
> c0 48 0f 44 eb 4c 89 e2 48 89 ee 48 c7 c7 00 c6 ae 88 e8 7a 76 0c 00
> <0f> 0b e9 2d ff ff ff 80 3d e8 70 97 00 00 49 c7 c4 73 bb ae 88 75
> kernel: RSP: 0018:ffffb596c0237d80 EFLAGS: 00010282
> kernel: RAX: 0000000000000000 RBX: ffff9af9c1835000 RCX: ffff9af9fed17538
> kernel: RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9af9fed17530
> kernel: RBP: ffff9af9c1835000 R08: ffffffff88c96ac8 R09: 0000000000004ffb
> kernel: R10: 00000000fffff000 R11: 3fffffffffffffff R12: ffffffff88ac7c3d
> kernel: R13: 0000000000000000 R14: ffffffff88cb2748 R15: ffff9af9c12166c8
> kernel: FS:  00007fd4911b8740(0000) GS:ffff9af9fed00000(0000)
> knlGS:0000000000000000
> kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> kernel: CR2: 0000000000532008 CR3: 000000000115c000 CR4: 00000000000406e0
> kernel: Call Trace:
> kernel:  devinet_sysctl_forward+0x1ac/0x1e0
> kernel:  proc_sys_call_handler+0x127/0x230
> kernel:  new_sync_write+0x114/0x1a0
> kernel:  vfs_write+0x18c/0x220
> kernel:  ksys_write+0x5a/0xd0
> kernel:  do_syscall_64+0x45/0x80
> kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
> kernel: RIP: 0033:0x7fd4912b79b3
> kernel: Code: 8b 15 b9 74 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb
> b7 0f 1f 00 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05
> <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
> kernel: RSP: 002b:00007ffe96fdd858 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> kernel: RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fd4912b79b3
> kernel: RDX: 0000000000000002 RSI: 0000000000536810 RDI: 0000000000000001
> kernel: RBP: 0000000000536810 R08: 000000000000000a R09: 0000000000000000
> kernel: R10: 00007fd49134f040 R11: 0000000000000246 R12: 0000000000000002
> kernel: R13: 00007fd4913906c0 R14: 00007fd49138c520 R15: 00007fd49138b920
> kernel: ---[ end trace ee7985b10570603d ]---
> kernel: ------------[ cut here ]------------

So the warning is easy to reproduce.
On qemu/kvm just set ctrl_guest_offloads=off for the device.
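
For reference, a rough sketch of such an invocation. Only
ctrl_guest_offloads=off comes from the line above; the image path, memory
size and netdev id are placeholders, and the helper name qemu_args is made
up for illustration.

```shell
# Sketch only: print QEMU arguments for a virtio-net device whose
# ctrl_guest_offloads property is switched off.
# Assumed placeholders: disk image path, memory size, netdev id "net0".
qemu_args() {
    image=$1    # path to the guest disk image (placeholder)
    printf '%s\n' \
        -enable-kvm \
        -m 1024 \
        -drive "file=${image},if=virtio" \
        -netdev user,id=net0 \
        -device virtio-net-pci,netdev=net0,ctrl_guest_offloads=off
}

# Usage (not run here; assumes an image path without spaces):
#   qemu-system-x86_64 $(qemu_args guest.img)
```

Inside such a guest, writing 1 to /proc/sys/net/ipv4/ip_forward should then
produce the warning.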

The panic does not seem to trigger for me and you did not provide
any data about it.  What happens? Does guest just freeze?

I am guessing the issue is that dev_disable_lro does not report the
return status and inet_forward_change assumes it's successful.  We then
end up with LRO packets in unexpected places.
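
If that guess holds, the guest should still report LRO as enabled after the
sysctl write. A minimal sketch for checking this from inside the guest,
assuming ethtool is installed; the helper name lro_state is hypothetical and
eth0 is the interface name from the log above.

```shell
# Read `ethtool -k` output on stdin and print the state of the
# large-receive-offload feature, e.g. "on", "off" or "on [fixed]".
lro_state() {
    awk -F': ' '$1 == "large-receive-offload" { print $2 }'
}

# Usage in the guest (not run here):
#   ethtool -k eth0 | lro_state
#   echo 1 > /proc/sys/net/ipv4/ip_forward
#   ethtool -k eth0 | lro_state
```

A result of "on [fixed]" would mean the feature is enabled and cannot be
changed through ethtool.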

Cc netdev and a bunch of people who might have a better idea.

-- 
MST



* Re: PROBLEM: virtio_net LRO kernel panics
  2021-07-30 11:42 ` PROBLEM: virtio_net LRO kernel panics Michael S. Tsirkin
@ 2021-07-30 17:04   ` Ivan
  2021-07-31 20:53     ` Michael S. Tsirkin
  2021-08-02  4:35     ` Jason Wang
  0 siblings, 2 replies; 6+ messages in thread
From: Ivan @ 2021-07-30 17:04 UTC
  To: Michael S. Tsirkin
  Cc: Jason Wang, Willem de Bruijn, David S. Miller, Tonghao Zhang,
	virtualization, netdev, Eric Dumazet, Jakub Kicinski, Ivan

On Fri, Jul 30, 2021 at 6:42 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> So the warning is easy to reproduce.
> On qemu/kvm just set ctrl_guest_offloads=off for the device.

I have no control over the settings of the host.
I have full control over the guest.

> The panic does not seem to trigger for me and you did not provide
> any data about it.  What happens? Does guest just freeze?

I'm not sure whether I am misusing the word "panic". (Apologies, I'm not a programmer.)
No, the guest does not freeze. The moment I issue the command...
  echo 1 > /proc/sys/net/ipv4/ip_forward
... I see the "--[ cut here ]--" message appear in the syslog.
Shortly thereafter my ssh session to that host dies.


* Re: PROBLEM: virtio_net LRO kernel panics
  2021-07-30 17:04   ` Ivan
@ 2021-07-31 20:53     ` Michael S. Tsirkin
  2021-07-31 23:52       ` Ivan
  2021-08-02  4:35     ` Jason Wang
  1 sibling, 1 reply; 6+ messages in thread
From: Michael S. Tsirkin @ 2021-07-31 20:53 UTC
  To: Ivan
  Cc: Jason Wang, Willem de Bruijn, David S. Miller, Tonghao Zhang,
	virtualization, netdev, Eric Dumazet, Jakub Kicinski

On Fri, Jul 30, 2021 at 12:04:18PM -0500, Ivan wrote:
> On Fri, Jul 30, 2021 at 6:42 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > So the warning is easy to reproduce.
> > On qemu/kvm just set ctrl_guest_offloads=off for the device.
> 
> I have no control over the settings of the host.
> I have full control over the guest.
> 
> > The panic does not seem to trigger for me and you did not provide
> > any data about it.  What happens? Does guest just freeze?
> 
> I'm not sure whether I am misusing the word "panic". (Apologies, I'm not a programmer.)
> No, the guest does not freeze. The moment I issue the command...
>   echo 1 > /proc/sys/net/ipv4/ip_forward
> ... I see the "--[ cut here ]--" message appear in the syslog.
> Shortly thereafter my ssh session to that host dies.

So is that the ssh session to the host, or to the guest?

-- 
MST



* Re: PROBLEM: virtio_net LRO kernel panics
  2021-07-31 20:53     ` Michael S. Tsirkin
@ 2021-07-31 23:52       ` Ivan
  0 siblings, 0 replies; 6+ messages in thread
From: Ivan @ 2021-07-31 23:52 UTC
  To: Michael S. Tsirkin
  Cc: Jason Wang, Willem de Bruijn, David S. Miller, Tonghao Zhang,
	virtualization, netdev, Eric Dumazet, Jakub Kicinski, Ivan

On Sat, Jul 31, 2021 at 3:53 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Jul 30, 2021 at 12:04:18PM -0500, Ivan wrote:
> > On Fri, Jul 30, 2021 at 6:42 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > So the warning is easy to reproduce.
> > > On qemu/kvm just set ctrl_guest_offloads=off for the device.
> >
> > I have no control over the settings of the host.
> > I have full control over the guest.
> >
> > > The panic does not seem to trigger for me and you did not provide
> > > any data about it.  What happens? Does guest just freeze?
> >
> > I'm not sure whether I am misusing the word "panic". (Apologies, I'm not a programmer.)
> > No, the guest does not freeze. The moment I issue the command...
> >   echo 1 > /proc/sys/net/ipv4/ip_forward
> > ... I see the "--[ cut here ]--" message appear in the syslog.
> > Shortly thereafter my ssh session to that host dies.
>
> So the host or to the guest?

Sorry!  The guest. (My bad)  This problem happens in the guest.
My ssh session to that guest dies shortly after I issue that command.


* Re: PROBLEM: virtio_net LRO kernel panics
  2021-07-30 17:04   ` Ivan
  2021-07-31 20:53     ` Michael S. Tsirkin
@ 2021-08-02  4:35     ` Jason Wang
  2021-08-02 18:16       ` Ivan
  1 sibling, 1 reply; 6+ messages in thread
From: Jason Wang @ 2021-08-02  4:35 UTC
  To: Ivan, Michael S. Tsirkin
  Cc: Willem de Bruijn, David S. Miller, Tonghao Zhang, virtualization,
	netdev, Eric Dumazet, Jakub Kicinski


On 2021/7/31 at 1:04 AM, Ivan wrote:
> On Fri, Jul 30, 2021 at 6:42 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>> So the warning is easy to reproduce.
>> On qemu/kvm just set ctrl_guest_offloads=off for the device.
> I have no control over the settings of the host.
> I have full control over the guest.
>
>> The panic does not seem to trigger for me and you did not provide
>> any data about it.  What happens? Does guest just freeze?
> I'm not sure whether I am misusing the word "panic". (Apologies, I'm not a programmer.)
> No, the guest does not freeze. The moment I issue the command...
>    echo 1 > /proc/sys/net/ipv4/ip_forward
> ... I see the "--[ cut here ]--" message appear in the syslog.
> Shortly thereafter my ssh session to that host dies.


Does it work before this commit?

commit a02e8964eaf9271a8a5fcc0c55bd13f933bafc56
Author: Willem de Bruijn <willemb@google.com>
Date:   Thu Dec 20 17:14:54 2018 -0500

     virtio-net: ethtool configurable LRO
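
One way to check whether a given kernel tag already contains that commit,
as a sketch: it assumes a git clone of the kernel tree, and the helper name
has_commit is hypothetical.

```shell
# Succeed (exit status 0) if commit $1 is an ancestor of ref $2,
# i.e. if the history of $2 already contains $1.
has_commit() {
    git merge-base --is-ancestor "$1" "$2"
}

# Usage inside a kernel clone (not run here; v5.13 is just an example tag):
#   has_commit a02e8964eaf9271a8a5fcc0c55bd13f933bafc56 v5.13 && echo contained
```

A kernel built from the parent of that commit would be the "before" case.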

Thanks




* Re: PROBLEM: virtio_net LRO kernel panics
  2021-08-02  4:35     ` Jason Wang
@ 2021-08-02 18:16       ` Ivan
  0 siblings, 0 replies; 6+ messages in thread
From: Ivan @ 2021-08-02 18:16 UTC
  To: Jason Wang
  Cc: Michael S. Tsirkin, Willem de Bruijn, David S. Miller,
	Tonghao Zhang, virtualization, netdev, Eric Dumazet,
	Jakub Kicinski, Ivan

On Sun, Aug 1, 2021 at 11:35 PM Jason Wang <jasowang@redhat.com> wrote:
>
>
> On 2021/7/31 at 1:04 AM, Ivan wrote:
> > On Fri, Jul 30, 2021 at 6:42 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >> So the warning is easy to reproduce.
> >> On qemu/kvm just set ctrl_guest_offloads=off for the device.
> > I have no control over the settings of the host.
> > I have full control over the guest.
> >
> >> The panic does not seem to trigger for me and you did not provide
> >> any data about it.  What happens? Does guest just freeze?
> > I'm not sure whether I am misusing the word "panic". (Apologies, I'm not a programmer.)
> > No, the guest does not freeze. The moment I issue the command...
> >    echo 1 > /proc/sys/net/ipv4/ip_forward
> > ... I see the "--[ cut here ]--" message appear in the syslog.
> > Shortly thereafter my ssh session to that host dies.
>
>
> Does it work before this commit?
>
> commit a02e8964eaf9271a8a5fcc0c55bd13f933bafc56
> Author: Willem de Bruijn <willemb@google.com>
> Date:   Thu Dec 20 17:14:54 2018 -0500
>
>      virtio-net: ethtool configurable LRO

Yes.

