From: "Pasi Kärkkäinen" <pasik@iki.fi>
To: "Xu, Dongxiao" <dongxiao.xu@intel.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Christophe Saout <christophe@saout.de>
Subject: Re: new netfront and occasional receive path lockup
Date: Thu, 9 Sep 2010 21:50:58 +0300
Message-ID: <20100909185058.GR2804@reaktio.net>
In-Reply-To: <D5AB6E638E5A3E4B8F4406B113A5A19A2A44184D@shsmsx501.ccr.corp.intel.com>

On Wed, Aug 25, 2010 at 08:51:09AM +0800, Xu, Dongxiao wrote:
> Hi Christophe,
> 
> Thanks for finding and checking the problem.
> I will try to reproduce the issue and check what caused the problem.
> 

Hello,

Was this issue resolved? Some users have been complaining about
"network freezing up" issues recently on ##xen on IRC.

-- Pasi

> Thanks,
> Dongxiao
> 
> Jeremy Fitzhardinge wrote:
> >  On 08/22/2010 09:43 AM, Christophe Saout wrote:
> >> Hi,
> >> 
> >> I've been playing with some of the new pvops code, namely DomU guest
> >> code.  What I've been observing on one of the virtual machines is
> >> that the network (vif) is dying after about ten to sixty minutes of
> >> uptime.  The unfortunate thing here is that I can only reproduce it
> >> on a production VM and have so far been unable to trigger the bug on
> >> a test machine.  While this has not been tragic (rebooting fixed the
> >> issue), unfortunately I can't spend very much time on debugging after
> >> the issue pops up.
> > 
> > Ah, OK.  I've seen this a couple of times as well.  And it just
> > happened to me then... 
> > 
> > 
> >> Now, what is happening is that the receive path goes dead.  The DomU
> >> can send packets to Dom0 and those are visible using tcpdump on the
> >> Dom0 on the virtual interface, but not the other way around.
> > 
> > I hadn't got to that level of diagnosis, but I can confirm that
> > that's what seems to be happening here too. 
> > 
> >> Now, I have done more than one change at a time (I'd like to avoid
> >> going into pinning it down since I can only reproduce it on a
> >> production machine, as I said, so suggestions are welcome), but my
> >> suspicion is that it might have to do with the new "smart polling"
> >> feature in xen/netfront.  Note that I have also updated Dom0 to pull
> >> in the latest dom0/backend and netback changes, just to make sure
> >> it's not due to an issue that has been fixed there, but I'm still
> >> seeing the same.
> > 
> > I agree.  I think I started seeing this once I merged smartpoll into
> > netfront. 
> > 
> >     J
> > 
> >> The production machine is a machine that doesn't have much network
> >> load, but deals with a lot of small network requests (DNS and smtp
> >> mostly).  A workload which is hard to reproduce on the test machine.
> >> Heavy network load (NFS, FTP and so on) for days hasn't triggered the
> >> problem.  Also, segmentation offloading and similar settings don't
> >> have any effect. 
> >> 
> >> The machine has 2 physical and the VM 2 virtual CPUs, DomU has
> >> PREEMPT enabled.
> >> 
> >> I've been looking at the code, if there might be a race condition
> >> somewhere, something like where one could run into a situation where
> >> the hrtimer doesn't run and Dom0 believes the DomU should be polling
> >> and doesn't emit an interrupt or something, but I'm afraid I don't
> >> know enough to judge this (I mean, there are spinlocks which look
> >> safe to me).
> >> 
> >> Do you have any suggestions what to try?  I can trigger the issue on
> >> the production VM again, but debugging should not take more than a
> >> few minutes if it happens.  Access is only possible via the console.
> >> Neither Dom0 nor the guest show anything unusual in the kernel
> >> messages and both continue to behave normally after the network goes
> >> dead (I am also able to shut down the guest normally).
> >> 
> >> Thanks,
> >> 	Christophe
> >> 
> >> 
> >> 
> >> _______________________________________________
> >> Xen-devel mailing list
> >> Xen-devel@lists.xensource.com
> >> http://lists.xensource.com/xen-devel
> 
> 


