From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: persistent tun & different virtual NICs & dead guest network
Date: Sun, 05 Apr 2009 14:58:48 +0300
Message-ID: <49D89CF8.8040200@redhat.com>
References: <49D735D6.3070803@msgid.tls.msk.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: KVM list , qemu-devel
To: Michael Tokarev
Return-path:
Received: from mx2.redhat.com ([66.187.237.31]:50228 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750971AbZDEL66 (ORCPT ); Sun, 5 Apr 2009 07:58:58 -0400
In-Reply-To: <49D735D6.3070803@msgid.tls.msk.ru>
Sender: kvm-owner@vger.kernel.org
List-ID:

(cc qemu-devel)

Michael Tokarev wrote:
> Hello.
>
> 2 days debugging an.. issue here, and finally got it.
> To make the long and painful (it was for me anyway)
> story short...
>
> kvm provides a way to control various offload settings
> on the "host side" of the tun network device (I mean
> the `-net tap' setup) from within guest. I.e., guest
> can set/clear various offload bits according to its
> capabilities/wishes.
>
> The problem is that different virtual NICs as used by
> kvm/qemu expects and sets different offload bits for
> the virtual NIC. And sets only those bits which -
> as they "think" - differs from the default (all-off).
>
> This means that when changing virtual NIC model AND
> using persistent tun device, it's very likely to get
> inconsistent flags.
>
> For example, here's how the offload settings on the
> host looks like after using e1000 driver in guest
> (freshly created persistent tun device):
>
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: off
> large receive offload: off
>
> Here's the same setting when using virtio_net
> instead:
>
> rx-checksumming: on
> tx-checksumming: off
> scatter-gather: off
> tcp segmentation offload: off
> udp fragmentation offload: off
> generic segmentation offload: off
> large receive offload: off
>
> I.e., only rx-checksumming. When using virtio_net
> from 2.6.29, which supports LRO, it also turns on
> large receive offload.
>
> Now, say, I tried a host with e1000 driver, and it
> turned on tx, sg and tso bits. And now I'm trying
> to run a guest with new virtio-net NIC instead. It
> turns on lro bit, but the network does not work anyway:
> almost any packet that's being sent from host to the
> guest has incorrect checksum - because the NIC is marked
> as able to do tx-checksumming but it does not do it.
> The network is dead.
>
> Now, after trying that and this, not understanding
> what's going on etc, let's reboot back with e1000
> NIC which worked a few minutes ago... just to discover
> that it does not work anymore too! Because previous
> attempt with virtio_net resulted in lro being on, but
> the driver does not support it! So now, we've non-
> working network again, and now, it does not matter
> which driver we'll try: neither of them will work
> because the offload settings are broken.
>
> It's more: one can't control this stuff from the
> host side using standard ethtool: it says that
> the operation is not supported (I wonder how kvm
> performs the settings changes).
>
> The solution here is to re-create the tun device
> before changing the virtual NIC model. But it
> isn't always possible, esp. when guests are
> being run from non-root user (where persistent
> tun devices are most useful).
>
> Can this be fixed somehow please?
>
> I think all the settings should be reset to 0
> when opening the tun device.

This should definitely be fixed. I'll look at writing a patch.

-- 
error compiling committee.c: too many arguments to function
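
For readers following the thread, here is a minimal Python sketch of the failure mode described in the quoted report and of the proposed reset-on-open behaviour. It is a toy model, not the actual kvm/qemu or kernel code. The flag names are copied from the ethtool output quoted above. The helper names (`switch_nic`, `reset_tun_offload`) are hypothetical. That the host-side bits are changed through the TUNSETOFFLOAD ioctl from linux/if_tun.h is an assumption, not something confirmed in this thread, and the request value shown is the x86 encoding.

```python
import fcntl

# Assumption: TUNSETOFFLOAD is _IOW('T', 208, unsigned int) in linux/if_tun.h.
# The value below is the x86 encoding; other architectures may differ.
TUNSETOFFLOAD = 0x400454D0

# Flags reported as "on" in the quoted ethtool output (fresh tun device).
E1000_FLAGS = frozenset({
    "rx-checksumming",
    "tx-checksumming",
    "scatter-gather",
    "tcp segmentation offload",
})
# virtio_net from 2.6.29: rx-checksumming plus LRO, per the report.
VIRTIO_NET_2_6_29_FLAGS = frozenset({
    "rx-checksumming",
    "large receive offload",
})


def switch_nic(previous_flags, new_nic_flags, reset_on_open):
    """Toy model: return the flags left on the persistent tun device.

    Without a reset, the new NIC model only turns bits on, so bits left
    behind by the previous NIC model stay set.
    """
    base = set() if reset_on_open else set(previous_flags)
    return base | set(new_nic_flags)


def reset_tun_offload(tun_fd):
    """Hypothetical helper: clear all offload bits on an attached tun fd.

    Assumes tun_fd was already bound to the device with TUNSETIFF.
    """
    fcntl.ioctl(tun_fd, TUNSETOFFLOAD, 0)
```

With `reset_on_open=False`, switching from e1000 to virtio_net leaves tx-checksumming, scatter-gather and TSO set alongside LRO, which is the inconsistent state the report describes. With `reset_on_open=True`, only the bits the new NIC model asked for remain.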