From mboxrd@z Thu Jan 1 00:00:00 1970
From: "SourceForge.net"
Subject: [ kvm-Bugs-2506814 ] TAP network lockup after some traffic
Date: Wed, 06 May 2009 15:07:33 +0000
Message-ID:
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
To: noreply@sourceforge.net
Return-path:
Received: from ch3.sourceforge.net ([216.34.181.60]:17669 "EHLO ch3.sourceforge.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755490AbZEFPHg (ORCPT ); Wed, 6 May 2009 11:07:36 -0400
Sender: kvm-owner@vger.kernel.org
List-ID:

Bugs item #2506814, was opened at 2009-01-14 11:38
Message generated for change (Comment added) made by mellen
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2506814&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Fabio Coatti (cova)
Assigned to: Nobody/Anonymous (nobody)
Summary: TAP network lockup after some traffic

Initial Comment:
Hi all,
we are experiencing severe network trouble using kvm+tap networking.
Basically, after some network load (we have yet to identify the exact amount of traffic, if one exists) the network stops working.

During lockups, tcpdump shows ARP requests on the guest interface, then on the tap, brX and physical interfaces on the host system. The ARP answers can be seen with tcpdump only on the physical host interface and the bridge (brX), but not on the tap device.

Basically, it seems that packets coming from the external network get lost in the tap device on the way to the guest (kvm).
Looking at the tap data with ifconfig, the only odd thing is that the TX packets overrun count is > 0. Every time the network stops working, the overrun count increases.
This has been observed with several kvm releases (for sure 76/77/78/79/80/81/82) and with different kernels (we tried some versions among 2.6.25.X, 2.6.26.X, 2.6.27.X and 2.6.28) on both the guest and host side.
We tried several network drivers (virtio, e1000, rtl) and all show the same problem. Only 100Mbit drivers seem to be unaffected so far (only virtio has acceptable performance).
(btw: on the host machine we have VLANs on top of the ethX devices)
The number of CPUs on the guest makes no difference.
We tried both the kvm modules provided by the vanilla kernel and the modules provided by the kvm package.

guest: 32 bit
host: 64 bit
host machine: 2 x Quad-Core AMD Opteron(tm) Processor 2352, 16GB, gentoo

Of course I can provide more details, perform other tests and try patches. If someone can give me some hints and advice, they will be most welcome.
Thanks.

----------------------------------------------------------------------

Comment By: Tais M. Hansen (mellen)
Date: 2009-05-06 17:07

Message:
What is the status of this issue? I'm experiencing the same problem,
apparently at random, on guests. Restarting the network interface does not
seem to fix the problem. Only a reboot (or system_reset in the kvm/qemu
console) solves it.

The last time it happened was on a host with kvm-84 and a guest with kernel
2.6.27 using virtio-net. Leading up to the stall, the guest had a traffic
load of about 5 Mbit, constantly going up and down, for just over 2 hours,
with one 12 Mbit spike. I did not check interface counters or network traces
at the time.

----------------------------------------------------------------------

Comment By: Fabio Coatti (cova)
Date: 2009-01-14 14:24

Message:
It seems quite similar indeed, but ip link set eth up/down on the guest side
seems to have no effect.
Besides that, looking at the thread you pointed me to, I can see another
difference: sniffing on the tap device in my case shows only outgoing packets
(i.e. packets leaving the kvm guest). So it may be the very same issue, but
some differences are present.
At least, we are seeing this on newer kernels and kvm revisions :)

----------------------------------------------------------------------

Comment By: Mark McLoughlin (markmc)
Date: 2009-01-14 12:19

Message:
Does ifup/ifdown in the guest fix the hang? If so, it sounds like the issue
discussed in this long thread:

http://www.mail-archive.com/kvm@vger.kernel.org/msg06774.html

We still haven't got to the bottom of it.

----------------------------------------------------------------------
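
The initial report ties every lockup to an increase in the tap device's TX overrun count as shown by ifconfig, while the later comment notes that no interface counters were checked before the stall. The following is a minimal Python sketch for logging that counter on the host so an increase can be matched against the time of a stall. It rests on assumptions that are not in the report: that the TX "overruns" value printed by ifconfig corresponds to the transmit "fifo" column of /proc/net/dev, that the file has the usual 16 counters per interface, and that the tap device is called `tap0` (a placeholder name; pass the real one as the first argument). The function names are made up for this illustration.

```python
import sys
import time

# Assumed /proc/net/dev layout: 8 receive counters followed by 8 transmit
# counters. The transmit "fifo" column is then index 8 + 4 = 12.
TX_FIFO_INDEX = 12


def tx_overruns(proc_net_dev_text, iface):
    """Return the transmit fifo counter for iface from /proc/net/dev text."""
    for line in proc_net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or "|" in line:
            continue  # header lines have no "name:" prefix
        if name.strip() != iface:
            continue
        fields = rest.split()
        if len(fields) < 16:
            raise ValueError("unexpected /proc/net/dev format: %r" % line)
        return int(fields[TX_FIFO_INDEX])
    raise KeyError(iface)


def read_tx_overruns(iface, path="/proc/net/dev"):
    """Read the current counter for iface from the live system."""
    with open(path) as f:
        return tx_overruns(f.read(), iface)


if __name__ == "__main__":
    # "tap0" is a hypothetical default, not a name taken from the report.
    iface = sys.argv[1] if len(sys.argv) > 1 else "tap0"
    last = read_tx_overruns(iface)
    while True:
        time.sleep(5)
        current = read_tx_overruns(iface)
        if current > last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print("%s %s TX overruns %d -> %d" % (stamp, iface, last, current))
        last = current
```

Run on the host next to the guest, this prints one line each time the counter grows. If the observation in the initial comment holds, those timestamps should line up with the moments the guest loses connectivity.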