* "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
@ 2016-12-12 23:33 Theodore Ts'o
  2016-12-13  2:28 ` Michael S. Tsirkin
  2016-12-13 17:46 ` Wei Xu
  0 siblings, 2 replies; 10+ messages in thread
From: Theodore Ts'o @ 2016-12-12 23:33 UTC
  To: jasowang; +Cc: netdev, mst, nhorman, davem

Hi,

I was doing a last-minute regression test of the ext4 tree before
sending a pull request to Linus, which I do using gce-xfstests[1], and
I found that networking was broken on GCE on linux-next.  I was
using next-20161209, and after bisecting, I narrowed down the
commit that causes the breakage to commit 449000102901:
"virtio-net: enable multiqueue by default".  Reverting this commit on
top of next-20161209 fixed the problem.

[1] http://thunk.org/gce-xfstests

You can reproduce the problem by building the kernel for Google
Compute Engine --- I use a config such as this [2], and then try to
boot a kernel on a VM.  The way I do this involves booting a test
appliance and then kexec'ing into the kernel to be tested[3], using a
2cpu configuration.  (GCE machine type: n1-standard-2)

[2] https://git.kernel.org/cgit/fs/ext2/xfstests-bld.git/tree/kernel-configs/ext4-x86_64-config-4.9
[3] https://github.com/tytso/xfstests-bld/blob/master/Documentation/gce-xfstests.md

You can then look at the serial console using a command such as
"gcloud compute instances get-serial-port-output <instance-name>", and
you will get something like this (see attached).  The important bit is
that dhclient never gets a response from the network ("No DHCPOFFERS
received"), from which I deduce that network send, receive, or both
are badly affected by the commit in question.

Please let me know if there's anything I can do to help you debug this
further.

Cheers,

						- Ted

Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] Linux version 4.9.0-rc8-ext4-06387-g03e5cbd (tytso@tytso-ssd) (gcc version 4.9.2 (Debian 4.9.2-10) ) #9 SMP Mon Dec 12 04:50:16 UTC 2016
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] Command line: root=/dev/sda1 ro console=ttyS0,38400n8 elevator=noop console=ttyS0  fstestcfg=4k fstestset=-g,quick fstestexc= fstestopt=aex fstesttyp=ext4 fstestapi=1.3
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Load Kernel Modules.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Apply Kernel Variables...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounting Configuration File System...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounting FUSE Control File System...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounted FUSE Control File System.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounted Configuration File System.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Apply Kernel Variables.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Create Static Device Nodes in /dev.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting udev Kernel Device Manager...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Kernel Device Manager.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Coldplug all Devices.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting udev Wait for Complete Device Initialization...
Dec 11 23:53:20 xfstests-201612120451 systemd-fsck[1659]: xfstests-root: clean, 56268/655360 files, 357439/2620928 blocks
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started File System Check on Root Device.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Remount Root and Kernel File Systems...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Remount Root and Kernel File Systems.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Various fixups to make systemd work better on Debian.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Load/Save Random Seed...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Local File Systems (Pre).
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Local File Systems (Pre).
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Load/Save Random Seed.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Wait for Complete Device Initialization.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Activation of LVM2 logical volumes...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Copy rules generated while the root was ro...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS0.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS1.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Copy rules generated while the root was ro.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS2.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS3.
Dec 11 23:53:20 xfstests-201612120451 systemd-udevd[2568]: could not open moddep file '/lib/modules/4.9.0-rc8-ext4-06387-g03e5cbd/modules.dep.bin'
Dec 11 23:53:20 xfstests-201612120451 lvm[2579]: No volume groups found
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Activation of LVM2 logical volumes.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Encrypted Volumes.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Encrypted Volumes.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Activation of LVM2 logical volumes...
Dec 11 23:53:20 xfstests-201612120451 lvm[2625]: No volume groups found
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Activation of LVM2 logical volumes.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
Dec 11 23:53:20 xfstests-201612120451 lvm[2627]: No volume groups found
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Local File Systems.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Local File Systems.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Remote File Systems.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Remote File Systems.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Create Volatile Files and Directories...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting LSB: Generate ssh host keys if they do not exist...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting LSB: Raise network interfaces....
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Trigger Flushing of Journal to Persistent Storage.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Create Volatile Files and Directories.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started LSB: Generate ssh host keys if they do not exist.
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Update UTMP about System Boot/Shutdown...
Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Update UTMP about System Boot/Shutdown.
Dec 11 23:53:20 xfstests-201612120451 dhclient: Internet Systems Consortium DHCP Client 4.3.1
Dec 11 23:53:20 xfstests-201612120451 dhclient: Copyright 2004-2014 Internet Systems Consortium.
Dec 11 23:53:20 xfstests-201612120451 dhclient: All rights reserved.
Dec 11 23:53:20 xfstests-201612120451 dhclient: For info, please visit https://www.isc.org/software/dhcp/
Dec 11 23:53:20 xfstests-201612120451 dhclient: 
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Configuring network interfaces...Internet Systems Consortium DHCP Client 4.3.1
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Copyright 2004-2014 Internet Systems Consortium.
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: All rights reserved.
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: For info, please visit https://www.isc.org/software/dhcp/
Dec 11 23:53:20 xfstests-201612120451 dhclient: Listening on LPF/eth0/42:01:0a:f0:00:03
Dec 11 23:53:20 xfstests-201612120451 dhclient: Sending on   LPF/eth0/42:01:0a:f0:00:03
Dec 11 23:53:20 xfstests-201612120451 dhclient: Sending on   Socket/fallback
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Listening on LPF/eth0/42:01:0a:f0:00:03
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Sending on   LPF/eth0/42:01:0a:f0:00:03
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Sending on   Socket/fallback
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCP[^[[32m  OK  ^[[0m] DISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 13
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 13
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 17
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 17
Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 15
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 15
Dec 11 23:53:20 xfstests-201612120451 dhclient: No DHCPOFFERS received.
Dec 11 23:53:20 xfstests-201612120451 dhclient: Trying recorded lease 10.240.0.3
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: No DHCPOFFERS received.
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Trying recorded lease 10.240.0.3
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: connect: Network is unreachable
Dec 11 23:53:20 xfstests-201612120451 logger: /etc/dhcp/dhclient-exit-hooks returned non-zero exit status 2
Dec 11 23:53:20 xfstests-201612120451 dhclient: bound: renewal in 38598 seconds.
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: bound: renewal in 38598 seconds.
Dec 11 23:53:20 xfstests-201612120451 networking[2633]: done.


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-12 23:33 "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE Theodore Ts'o
@ 2016-12-13  2:28 ` Michael S. Tsirkin
  2016-12-13  3:12   ` Theodore Ts'o
  2016-12-13 17:46 ` Wei Xu
  1 sibling, 1 reply; 10+ messages in thread
From: Michael S. Tsirkin @ 2016-12-13  2:28 UTC
  To: Theodore Ts'o; +Cc: jasowang, netdev, nhorman, davem

On Mon, Dec 12, 2016 at 06:33:43PM -0500, Theodore Ts'o wrote:
> Hi,
> 
> I was doing a last-minute regression test of the ext4 tree before
> sending a pull request to Linus, which I do using gce-xfstests[1], and
> I found that networking was broken on GCE on linux-next.  I was
> using next-20161209, and after bisecting, I narrowed down the
> commit that causes the breakage to commit 449000102901:
> "virtio-net: enable multiqueue by default".  Reverting this commit on
> top of next-20161209 fixed the problem.
> 
> [1] http://thunk.org/gce-xfstests
> 
> You can reproduce the problem by building the kernel for Google
> Compute Engine --- I use a config such as this [2], and then try to
> boot a kernel on a VM.  The way I do this involves booting a test
> appliance and then kexec'ing into the kernel to be tested[3], using a
> 2cpu configuration.  (GCE machine type: n1-standard-2)
> 
> [2] https://git.kernel.org/cgit/fs/ext2/xfstests-bld.git/tree/kernel-configs/ext4-x86_64-config-4.9
> [3] https://github.com/tytso/xfstests-bld/blob/master/Documentation/gce-xfstests.md
> 
> You can then look at the serial console using a command such as
> "gcloud compute instances get-serial-port-output <instance-name>", and
> you will get something like this (see attached).  The important bit is
> that dhclient never gets a response from the network ("No DHCPOFFERS
> received"), from which I deduce that network send, receive, or both
> are badly affected by the commit in question.
> 
> Please let me know if there's anything I can do to help you debug this
> further.
> 
> Cheers,
> 
> 						- Ted

That's unfortunate, of course. It could be a hypervisor or
a guest kernel bug. ideas:
- does host have mq capability? how many queues?
- how about # of msix vectors?
- after you send something on tx queues,
  are interrupts arriving on rx queues?
- is problem rx or tx?
  set ip and arp manually and send a packet to known MAC,
  does it get there?


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13  2:28 ` Michael S. Tsirkin
@ 2016-12-13  3:12   ` Theodore Ts'o
  2016-12-13  3:30     ` Michael S. Tsirkin
  2016-12-13  3:43     ` Jason Wang
  0 siblings, 2 replies; 10+ messages in thread
From: Theodore Ts'o @ 2016-12-13  3:12 UTC
  To: Michael S. Tsirkin; +Cc: jasowang, netdev, nhorman, davem

On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
> 
> That's unfortunate, of course. It could be a hypervisor or
> a guest kernel bug. ideas:
> - does host have mq capability? how many queues?
> - how about # of msix vectors?
> - after you send something on tx queues,
>   are interrupts arriving on rx queues?
> - is problem rx or tx?
>   set ip and arp manually and send a packet to known MAC,
>   does it get there?

Sorry, I don't know how to debug virtio-net.  Given that it's in a
cloud environment, I also can't set IP addresses manually, since IP
addresses are assigned automatically.

If you can send me a patch, I'm happy to apply it and send you back
results.

I can say that I've had _zero_ problems using pretty much any kernel
from 3.10 to 4.9 using Google Compute Engine.  The commit I referenced
caused things to stop working.  So in terms of regression, this is
definitely a regression, and it's definitely caused by commit
449000102901.  Even if it is a hypervisor "bug", I'm pretty sure I
know what Linus will say if I ask him to revert it.  Linux kernels are
expected to work around hardware bugs, and breaking users just because
hardware is "broken" by some definition is generally not considered
friendly, especially when it has been working for years and years before
some commit "fixed" things.

I would very much like to work with you to fix it, but I will need
your help, since virtio-net doesn't seem to print any informational
messages during the boot sequence, and I don't know the best way to
debug it.

Cheers,

						- Ted


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13  3:12   ` Theodore Ts'o
@ 2016-12-13  3:30     ` Michael S. Tsirkin
  2016-12-13  3:43     ` Jason Wang
  1 sibling, 0 replies; 10+ messages in thread
From: Michael S. Tsirkin @ 2016-12-13  3:30 UTC
  To: Theodore Ts'o; +Cc: jasowang, netdev, nhorman, davem

On Mon, Dec 12, 2016 at 10:12:43PM -0500, Theodore Ts'o wrote:
> On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
> > 
> > That's unfortunate, of course. It could be a hypervisor or
> > a guest kernel bug. ideas:
> > - does host have mq capability? how many queues?
> > - how about # of msix vectors?
> > - after you send something on tx queues,
> >   are interrupts arriving on rx queues?
> > - is problem rx or tx?
> >   set ip and arp manually and send a packet to known MAC,
> >   does it get there?
> 
> Sorry, I don't know how to debug virtio-net.  Given that it's in a
> cloud environment, I also can't set IP addresses manually, since IP
> addresses are assigned automatically.

OK, but you can send raw ethernet frames, presumably?
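
For the raw-frame test, a minimal C sketch might look like the following. It assumes a Linux guest and a process with CAP_NET_RAW, and it uses the IEEE local experimental ethertype 0x88B5 so the frame is not mistaken for real traffic. The helper names build_frame and send_frame are made up for illustration; they are not part of any kernel or library API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <sys/socket.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>

/* IEEE 802 local experimental ethertype (assumption: unused on this network). */
#define TEST_ETHERTYPE 0x88B5

/*
 * Hypothetical helper: lay out dst MAC, src MAC, ethertype and payload in
 * buf, zero-padded to the 60-byte Ethernet minimum (ETH_ZLEN).
 * Returns the frame length, or 0 if buf is too small.
 */
static size_t build_frame(uint8_t *buf, size_t size,
                          const uint8_t dst[ETH_ALEN],
                          const uint8_t src[ETH_ALEN],
                          const void *payload, size_t payload_len)
{
    size_t len = ETH_HLEN + payload_len;

    if (len < ETH_ZLEN)
        len = ETH_ZLEN;
    if (len > size)
        return 0;

    memset(buf, 0, len);
    memcpy(buf, dst, ETH_ALEN);
    memcpy(buf + ETH_ALEN, src, ETH_ALEN);
    buf[12] = TEST_ETHERTYPE >> 8;      /* ethertype, network byte order */
    buf[13] = TEST_ETHERTYPE & 0xff;
    memcpy(buf + ETH_HLEN, payload, payload_len);
    return len;
}

/*
 * Hypothetical helper: send one prebuilt frame on the named interface
 * through an AF_PACKET socket. Returns 0 on success, -1 on any error.
 */
static int send_frame(const char *ifname, const uint8_t *frame, size_t len)
{
    struct sockaddr_ll addr;
    unsigned int ifindex = if_nametoindex(ifname);
    int fd, rc;

    if (ifindex == 0)
        return -1;
    fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_ifindex = (int)ifindex;
    addr.sll_halen = ETH_ALEN;
    memcpy(addr.sll_addr, frame, ETH_ALEN);   /* destination MAC */

    rc = sendto(fd, frame, len, 0, (struct sockaddr *)&addr,
                sizeof(addr)) == (ssize_t)len ? 0 : -1;
    close(fd);
    return rc;
}
```

Building a frame with the guest MAC from the log (42:01:0a:f0:00:03) as source and a known peer MAC as destination, passing it to send_frame("eth0", ...), and capturing on the peer would help separate a tx failure from an rx failure.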


> If you can send me a patch, I'm happy to apply it and send you back
> results.

Let's start by collecting stats from sysfs for this device:
- please get the features bitmap from there,
- please get the /proc/interrupts mappings,
- and please use lspci to dump the PCI config.
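
As a rough aid for reading the features bitmap, here is a minimal C sketch. It assumes the sysfs "features" attribute of a virtio device is a string of '0'/'1' characters with feature bit 0 first, and that the NIC is reachable as /sys/class/net/eth0/device/features (the eth0 name is taken from the log above; the path is an assumption). The bit numbers come from the virtio specification; the function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Feature bit numbers as defined by the virtio specification. */
#define VIRTIO_NET_F_CTRL_VQ 17   /* control virtqueue available */
#define VIRTIO_NET_F_MQ      22   /* multiqueue with automatic steering */
#define VIRTIO_F_VERSION_1   32   /* virtio 1.0 (non-legacy) device */

/*
 * Return 1 if `bit` is set in a sysfs-style bitmap string, where
 * character i is '1' when feature bit i is set. Out-of-range bits
 * are reported as clear.
 */
static int virtio_feature_set(const char *bitmap, unsigned int bit)
{
    return bit < strlen(bitmap) && bitmap[bit] == '1';
}

/*
 * Read the first line of the features file at `path` into buf and
 * strip the trailing newline. Returns 0 on success, -1 on error.
 */
static int read_features(const char *path, char *buf, size_t size)
{
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(buf, (int)size, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

Calling read_features() on the assumed path and then virtio_feature_set(buf, VIRTIO_NET_F_MQ) would answer the "does host have mq capability" question; checking VIRTIO_F_VERSION_1 the same way would show whether the device is in legacy mode.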


> I can say that I've had _zero_ problems using pretty much any kernel
> from 3.10 to 4.9 using Google Compute Engine.  The commit I referenced
> caused things to stop working.  So in terms of regression, this is
> definitely a regression, and it's definitely caused by commit
> 449000102901.  Even if it is a hypervisor "bug", I'm pretty sure I
> know what Linus will say if I ask him to revert it.  Linux kernels are
> expected to work around hardware bugs, and breaking users just because
> hardware is "broken" by some definition is generally not considered
> friendly, especially when it has been working for years and years before
> some commit "fixed" things.

I'm open to limiting new features to virtio 1 mode just to
avoid the hassle of dealing with legacy hypervisors.
But let's not argue about it until we know the root cause.

> 
> I would very much like to work with you to fix it, but I will need
> your help, since virtio-net doesn't seem to print any informational
> messages during the boot sequence, and I don't know the best way to
> debug it.
> 
> Cheers,
> 
> 						- Ted


Let's start with debugging it like any PCI NIC.


-- 
MST


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13  3:12   ` Theodore Ts'o
  2016-12-13  3:30     ` Michael S. Tsirkin
@ 2016-12-13  3:43     ` Jason Wang
  2016-12-13  4:19       ` Theodore Ts'o
  1 sibling, 1 reply; 10+ messages in thread
From: Jason Wang @ 2016-12-13  3:43 UTC
  To: Theodore Ts'o, Michael S. Tsirkin; +Cc: netdev, nhorman, davem



On 2016-12-13 11:12, Theodore Ts'o wrote:
> On Tue, Dec 13, 2016 at 04:28:17AM +0200, Michael S. Tsirkin wrote:
>> That's unfortunate, of course. It could be a hypervisor or
>> a guest kernel bug. ideas:
>> - does host have mq capability? how many queues?
>> - how about # of msix vectors?
>> - after you send something on tx queues,
>>    are interrupts arriving on rx queues?
>> - is problem rx or tx?
>>    set ip and arp manually and send a packet to known MAC,
>>    does it get there?
> Sorry, I don't know how to debug virtio-net.  Given that it's in a
> cloud environment, I also can't set IP addresses manually, since IP
> addresses are assigned automatically.
>
> If you can send me a patch, I'm happy to apply it and send you back
> results.
>
> I can say that I've had _zero_ problems using pretty much any kernel
> from 3.10 to 4.9 using Google Compute Engine.  The commit I referenced
> caused things to stop working.  So in terms of regression, this is
> definitely a regression, and it's definitely caused by commit
> 449000102901.  Even if it is a hypervisor "bug", I'm pretty sure I
> know what Linus will say if I ask him to revert it.  Linux kernels are
> expected to work around hardware bugs, and breaking users just because
> hardware is "broken" by some definition is generally not considered
> friendly, especially when it has been working for years and years before
> some commit "fixed" things.
>
> I would very much like to work with you to fix it, but I will need
> your help, since virtio-net doesn't seem to print any informational
> messages during the boot sequence, and I don't know the best way to
> debug it.
>
> Cheers,
>
> 						- Ted

Thanks for reporting this issue. Looks like I blindly set the affinity 
instead of queues during probe. Could you please try the following patch 
to see if it works?

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b425fa1..fe9f772 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1930,7 +1930,9 @@ static int virtnet_probe(struct virtio_device *vdev)
                 goto free_unregister_netdev;
         }

-       virtnet_set_affinity(vi);
+       rtnl_lock();
+       virtnet_set_queues(vi, vi->curr_queue_pairs);
+       rtnl_unlock();

         /* Assume link up if device can't report link status,
            otherwise get link status from config. */
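
To illustrate why setting the queues matters, here is a toy userspace model in C. It is not driver code. It assumes, following the virtio specification, that a multiqueue device services only the first queue pair until the driver tells it otherwise with the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET control command; whether GCE's hypervisor behaves exactly this way is not established in this thread. All names are hypothetical.

```c
#include <stdio.h>

/* Toy model of a multiqueue virtio-net device; NOT driver code. */
struct toy_device {
    unsigned int max_pairs;     /* queue pairs the device offers */
    unsigned int active_pairs;  /* queue pairs the device services */
};

/* After reset, a multiqueue device services only the first pair. */
static void toy_device_reset(struct toy_device *d, unsigned int max_pairs)
{
    d->max_pairs = max_pairs;
    d->active_pairs = 1;
}

/*
 * Stand-in for the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET control command.
 * Returns 0 on success, -1 if the request is out of range.
 */
static int toy_set_queue_pairs(struct toy_device *d, unsigned int pairs)
{
    if (pairs < 1 || pairs > d->max_pairs)
        return -1;
    d->active_pairs = pairs;
    return 0;
}

/* Returns 1 if a packet placed on tx queue `q` would be serviced. */
static int toy_tx_serviced(const struct toy_device *d, unsigned int q)
{
    return q < d->active_pairs;
}

/* Count how many of the guest's `guest_pairs` tx queues are serviced. */
static unsigned int toy_serviced_queues(const struct toy_device *d,
                                        unsigned int guest_pairs)
{
    unsigned int q, n = 0;

    for (q = 0; q < guest_pairs; q++)
        n += toy_tx_serviced(d, q);
    return n;
}
```

In this model, a guest that spreads traffic over two queue pairs without first calling toy_set_queue_pairs(&dev, 2) has only one of its two tx queues serviced, which is the kind of mismatch the patch above appears to close by calling virtnet_set_queues() during probe.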


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13  3:43     ` Jason Wang
@ 2016-12-13  4:19       ` Theodore Ts'o
  0 siblings, 0 replies; 10+ messages in thread
From: Theodore Ts'o @ 2016-12-13  4:19 UTC
  To: Jason Wang; +Cc: Michael S. Tsirkin, netdev, nhorman, davem

On Tue, Dec 13, 2016 at 11:43:00AM +0800, Jason Wang wrote:
> Thanks for reporting this issue. Looks like I blindly set the affinity
> instead of queues during probe. Could you please try the following patch to
> see if it works?

This fixed things, thanks!!

						- Ted
						

> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index b425fa1..fe9f772 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1930,7 +1930,9 @@ static int virtnet_probe(struct virtio_device *vdev)
>                 goto free_unregister_netdev;
>         }
> 
> -       virtnet_set_affinity(vi);
> +       rtnl_lock();
> +       virtnet_set_queues(vi, vi->curr_queue_pairs);
> +       rtnl_unlock();
> 
>         /* Assume link up if device can't report link status,
>            otherwise get link status from config. */
> 
> 


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-12 23:33 "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE Theodore Ts'o
  2016-12-13  2:28 ` Michael S. Tsirkin
@ 2016-12-13 17:46 ` Wei Xu
  2016-12-13 19:44   ` Theodore Ts'o
  1 sibling, 1 reply; 10+ messages in thread
From: Wei Xu @ 2016-12-13 17:46 UTC
  To: Theodore Ts'o, jasowang; +Cc: netdev, mst, nhorman, davem


On 2016-12-13 07:33, Theodore Ts'o wrote:
> Hi,
>
> I was doing a last minute regression test of the ext4 tree before
> sending a pull request to Linus, which I do using gce-xfstests[1], and
> I found that using networking was broken on GCE on linux-next.  I was
> using next-20161209, and after bisecting things, I narrowed down the
> commit which is causing things to break to commit 449000102901:
> "virtio-net: enable multiqueue by default".  Reverting this commit on
> top of next-20161209 fixed the problem.
>
> [1] http://thunk.org/gce-xfstests
>
> You can reproduce the problem by building the kernel for Google
> Compute Engine --- I use a config such as this [2], and then try to
> boot a kernel on a VM.  The way I do this involves booting a test
> appliance and then kexec'ing into the kernel to be tested[3], using a
> 2cpu configuration.  (GCE machine type: n1-standard-2)
>
> [2] https://git.kernel.org/cgit/fs/ext2/xfstests-bld.git/tree/kernel-configs/ext4-x86_64-config-4.9
> [3] https://github.com/tytso/xfstests-bld/blob/master/Documentation/gce-xfstests.md
>
> You can then take a look at serial console using a command such as
> "gcloud compute instances get-serial-port-output <instance-name>", and
> you will get something like this (see attached).  The important bit is
> that the dhclient command completely fails to get a
> response from the network, from which I deduce that
> networking send, receive, or both are badly affected by
> the commit in question.
>
> Please let me know if there's anything I can do to help you debug this
> further.

Hi Ted,
I just had a quick try on GCE; sorry if these are stupid questions.

Q1:
Which distribution are you using for the GCE instance?

Q2:
Are you running xfstests as a nested VM, i.e. is the xfstests
appliance itself a VM inside the GCE instance? Or is the kernel built
for the instance itself?

Q3:
Can this bug be reproduced with kvm-xfstests? I'm trying to set up
a local test bed if that makes sense.

>
> Cheers,
>
> 						- Ted
>
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] Linux version 4.9.0-rc8-ext4-06387-g03e5cbd (tytso@tytso-ssd) (gcc version 4.9.2 (Debian 4.9.2-10) ) #9 SMP Mon Dec 12 04:50:16 UTC 2016
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] Command line: root=/dev/sda1 ro console=ttyS0,38400n8 elevator=noop console=ttyS0  fstestcfg=4k fstestset=-g,quick fstestexc= fstestopt=aex fstesttyp=ext4 fstestapi=1.3
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
> Dec 11 23:53:20 xfstests-201612120451 kernel: [    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Load Kernel Modules.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Apply Kernel Variables...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounting Configuration File System...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounting FUSE Control File System...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounted FUSE Control File System.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Mounted Configuration File System.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Apply Kernel Variables.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Create Static Device Nodes in /dev.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting udev Kernel Device Manager...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Kernel Device Manager.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Coldplug all Devices.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting udev Wait for Complete Device Initialization...
> Dec 11 23:53:20 xfstests-201612120451 systemd-fsck[1659]: xfstests-root: clean, 56268/655360 files, 357439/2620928 blocks
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started File System Check on Root Device.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Remount Root and Kernel File Systems...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Remount Root and Kernel File Systems.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Various fixups to make systemd work better on Debian.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Load/Save Random Seed...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Local File Systems (Pre).
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Local File Systems (Pre).
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Load/Save Random Seed.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started udev Wait for Complete Device Initialization.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Activation of LVM2 logical volumes...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Copy rules generated while the root was ro...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS0.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS1.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Copy rules generated while the root was ro.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS2.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Found device /dev/ttyS3.
> Dec 11 23:53:20 xfstests-201612120451 systemd-udevd[2568]: could not open moddep file '/lib/modules/4.9.0-rc8-ext4-06387-g03e5cbd/modules.dep.bin'
> Dec 11 23:53:20 xfstests-201612120451 lvm[2579]: No volume groups found
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Activation of LVM2 logical volumes.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Encrypted Volumes.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Encrypted Volumes.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Activation of LVM2 logical volumes...
> Dec 11 23:53:20 xfstests-201612120451 lvm[2625]: No volume groups found
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Activation of LVM2 logical volumes.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling...
> Dec 11 23:53:20 xfstests-201612120451 lvm[2627]: No volume groups found
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Local File Systems.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Local File Systems.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Remote File Systems.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Reached target Remote File Systems.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Create Volatile Files and Directories...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting LSB: Generate ssh host keys if they do not exist...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting LSB: Raise network interfaces....
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Trigger Flushing of Journal to Persistent Storage.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Create Volatile Files and Directories.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started LSB: Generate ssh host keys if they do not exist.
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Starting Update UTMP about System Boot/Shutdown...
> Dec 11 23:53:20 xfstests-201612120451 systemd[1]: Started Update UTMP about System Boot/Shutdown.
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Internet Systems Consortium DHCP Client 4.3.1
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Copyright 2004-2014 Internet Systems Consortium.
> Dec 11 23:53:20 xfstests-201612120451 dhclient: All rights reserved.
> Dec 11 23:53:20 xfstests-201612120451 dhclient: For info, please visit https://www.isc.org/software/dhcp/
> Dec 11 23:53:20 xfstests-201612120451 dhclient:
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Configuring network interfaces...Internet Systems Consortium DHCP Client 4.3.1
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Copyright 2004-2014 Internet Systems Consortium.
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: All rights reserved.
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: For info, please visit https://www.isc.org/software/dhcp/
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Listening on LPF/eth0/42:01:0a:f0:00:03
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Sending on   LPF/eth0/42:01:0a:f0:00:03
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Sending on   Socket/fallback
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Listening on LPF/eth0/42:01:0a:f0:00:03
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Sending on   LPF/eth0/42:01:0a:f0:00:03
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Sending on   Socket/fallback
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 13
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 13
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 17
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 17
> Dec 11 23:53:20 xfstests-201612120451 dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 15
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 15
> Dec 11 23:53:20 xfstests-201612120451 dhclient: No DHCPOFFERS received.
> Dec 11 23:53:20 xfstests-201612120451 dhclient: Trying recorded lease 10.240.0.3
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: No DHCPOFFERS received.
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: Trying recorded lease 10.240.0.3
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: connect: Network is unreachable
> Dec 11 23:53:20 xfstests-201612120451 logger: /etc/dhcp/dhclient-exit-hooks returned non-zero exit status 2
> Dec 11 23:53:20 xfstests-201612120451 dhclient: bound: renewal in 38598 seconds.
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: bound: renewal in 38598 seconds.
> Dec 11 23:53:20 xfstests-201612120451 networking[2633]: done.
>


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13 17:46 ` Wei Xu
@ 2016-12-13 19:44   ` Theodore Ts'o
  2016-12-14  4:24     ` Wei Xu
  0 siblings, 1 reply; 10+ messages in thread
From: Theodore Ts'o @ 2016-12-13 19:44 UTC
  To: Wei Xu; +Cc: jasowang, netdev, mst, nhorman, davem

Jason's patch fixed the issue, so I think we have the proper fix, but
to answer your questions:

On Wed, Dec 14, 2016 at 01:46:44AM +0800, Wei Xu wrote:
> 
> Q1:
> Which distribution are you using for the GCE instance?

The test appliance is based on Debian Jessie.

> Q2:
> Are you running xfstests as a nested VM, i.e. is the xfstests
> appliance itself a VM inside the GCE instance? Or is the kernel built
> for the instance itself?

No, GCE currently doesn't support nested VMs (i.e., running VMs
inside a GCE instance).  So the kernel is built for the instance itself.
The way the test appliance works is that it initially boots using the
Debian Jessie default kernel and then we kexec into the kernel under
test.

> Q3:
> Can this bug be reproduced with kvm-xfstests? I'm trying to set up
> a local test bed if that makes sense.

You definitely can't do it out of the box -- you need to build the
image using "gen-image --networking", and then run "kvm-xfstests -N
shell" as root.  But the bug doesn't reproduce on kvm-xfstests, using
a 4.9 host kernel and linux-next guest kernel.


Cheers,

					- Ted


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-13 19:44   ` Theodore Ts'o
@ 2016-12-14  4:24     ` Wei Xu
  2016-12-14 16:42       ` Theodore Ts'o
  0 siblings, 1 reply; 10+ messages in thread
From: Wei Xu @ 2016-12-14  4:24 UTC
  To: Theodore Ts'o; +Cc: jasowang, netdev, mst, nhorman, davem

On 2016-12-14 03:44, Theodore Ts'o wrote:
> Jason's patch fixed the issue, so I think we have the proper fix, but
> to answer your questions:
>
> On Wed, Dec 14, 2016 at 01:46:44AM +0800, Wei Xu wrote:
>>
>> Q1:
>> Which distribution are you using for the GCE instance?
>
> The test appliance is based on Debian Jessie.
>
>> Q2:
>> Are you running xfstests as a nested VM, i.e. is the xfstests
>> appliance itself a VM inside the GCE instance? Or is the kernel built
>> for the instance itself?
>
> No, GCE currently doesn't support nested VMs (i.e., running VMs
> inside a GCE instance).  So the kernel is built for the instance itself.
> The way the test appliance works is that it initially boots using the
> Debian Jessie default kernel and then we kexec into the kernel under
> test.
>
>> Q3:
>> Can this bug be reproduced with kvm-xfstests? I'm trying to set up
>> a local test bed if that makes sense.
>
> You definitely can't do it out of the box -- you need to build the
> image using "gen-image --networking", and then run "kvm-xfstests -N
> shell" as root.  But the bug doesn't reproduce on kvm-xfstests, using
> a 4.9 host kernel and linux-next guest kernel.
>

OK, thanks a lot.

BTW, although this is a guest issue, is there any way to view the GCE
host kernel or QEMU (if that is what it uses) version?

>
> Cheers,
>
> 					- Ted
>


* Re: "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE
  2016-12-14  4:24     ` Wei Xu
@ 2016-12-14 16:42       ` Theodore Ts'o
  0 siblings, 0 replies; 10+ messages in thread
From: Theodore Ts'o @ 2016-12-14 16:42 UTC
  To: Wei Xu; +Cc: jasowang, netdev, mst, nhorman, davem

On Wed, Dec 14, 2016 at 12:24:43PM +0800, Wei Xu wrote:
> 
> BTW, although this is a guest issue, is there any way to view the GCE
> host kernel or QEMU (if that is what it uses) version?

No, there isn't, as far as I know.

    	  	    	     - Ted


Thread overview: 10+ messages
2016-12-12 23:33 "virtio-net: enable multiqueue by default" in linux-next breaks networking on GCE Theodore Ts'o
2016-12-13  2:28 ` Michael S. Tsirkin
2016-12-13  3:12   ` Theodore Ts'o
2016-12-13  3:30     ` Michael S. Tsirkin
2016-12-13  3:43     ` Jason Wang
2016-12-13  4:19       ` Theodore Ts'o
2016-12-13 17:46 ` Wei Xu
2016-12-13 19:44   ` Theodore Ts'o
2016-12-14  4:24     ` Wei Xu
2016-12-14 16:42       ` Theodore Ts'o
