From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Anthony Wright <anthony@overnetdata.com>
Cc: Ian Campbell <Ian.Campbell@eu.citrix.com>,
	Todd Deshane <todd.deshane@xen.org>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: Kernel bug from 3.0 (was phy disks and vifs timing out in DomU)
Date: Fri, 26 Aug 2011 10:44:38 -0400
Message-ID: <20110826144438.GA24836@dumpdata.com>
In-Reply-To: <20110826142606.GA25511@dumpdata.com>

On Fri, Aug 26, 2011 at 10:26:06AM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 25, 2011 at 09:31:46PM +0100, Anthony Wright wrote:
> > On 19/08/2011 13:56, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Aug 19, 2011 at 11:22:15AM +0100, Anthony Wright wrote:
> > >> On 03/08/2011 16:28, Konrad Rzeszutek Wilk wrote:
> > >>> On Fri, Jul 29, 2011 at 08:53:02AM +0100, Anthony Wright wrote:
> > >>>> I've just upgraded to xen 4.1.1 with a stock 3.0 kernel on dom0 (with
> > >>>> the vga-support patch backported). I can't get my DomU's to work due to
> > >>>> the phy disks and vifs timing out in DomU and looking through my logs
> > >>>> this morning I'm getting a consistent kernel bug report with xen
> > >>>> mentioned at the top of the stack trace and vifdisconnect mentioned on
> > >>> Yikes! Ian any ideas what to try?
> > >>>
> > >>> Anthony, can you compile the kernel with debug=y and when this happens
> > >>> see what 'xl dmesg' gives? There is also 'xl debug-keys g', which
> > >>> should dump the grants in use.. that might help a bit.
> > >> I've compiled a 3.0.1 kernel with CONFIG_DEBUG=Y (a number of other
> > >> config values appeared at this point, and I took defaults for them).
> > >>
> > >> The output from /var/log/messages & 'xl dmesg' is attached. There was no
> > >> output from 'xl debug-keys g'.
> > > Ok, so I am hitting this too - I was hoping that the patch from Stefano
> > > would have fixed the issue, but sadly it did not.
> > >
> > > Let me (I am traveling right now) see if I can come up with an interim
> > > solution until Ian comes with the right fix.
> > >
> > Hi Konrad - any progress on this - it's a bit of a show stopper for me.
> 
> What is interesting is that it happens only with 32-bit guests and with
> not-so fast hardware: Atom D510 for me and in your case MSI MS-7309 motherboard
> (with what kind of processor?). I've a 64-bit hypervisor - not sure if you
> are using a 32-bit or 64-bit.
> 
> I hadn't tried to reproduce this on the Atom D510 with a 64-bit Dom0.
> But I was wondering if you had this setup before - with a 64-bit dom0?
> Or is that really not an option with your CPU?

So while I am still looking at the hypervisor code to figure out why
it would give me:

(XEN) mm.c:3846:d0 Could not find L1 PTE for address fbb42000

I've cobbled together this patch^H^H^Hhack to retry the transaction to see
if this is a temporary issue (race) or whether that L1 PTE is somehow
really gone.

If you could, can you try it out and see if the errors that are spit
out are repeated - mainly the "Could not find L1 PTE". You will need to
run the hypervisor with "loglvl=all", and compile it with debug=y, to get
that information.


diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index fd00f25..7bee981 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1607,7 +1607,7 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	struct gnttab_map_grant_ref op;
 	struct xen_netif_tx_sring *txs;
 	struct xen_netif_rx_sring *rxs;
-
+	int retry = 3;
 	int err = -ENOMEM;
 
 	vif->tx_comms_area = alloc_vm_area(PAGE_SIZE);
@@ -1620,7 +1620,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 
 	gnttab_set_map_op(&op, (unsigned long)vif->tx_comms_area->addr,
 			  GNTMAP_host_map, tx_ring_ref, vif->domid);
-
+	op.status = 0;
+retry_tx:
 	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
 		BUG();
 
@@ -1628,6 +1629,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 		netdev_warn(vif->dev,
 			    "failed to map tx ring. err=%d status=%d\n",
 			    err, op.status);
+		if (retry-- > 0)
+			goto retry_tx;
 		err = op.status;
 		goto err;
 	}
@@ -1641,6 +1644,9 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 	gnttab_set_map_op(&op, (unsigned long)vif->rx_comms_area->addr,
 			  GNTMAP_host_map, rx_ring_ref, vif->domid);
 
+	retry = 3;
+	op.status = 0;
+retry_rx:
 	if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &op, 1))
 		BUG();
 
@@ -1648,6 +1654,8 @@ int xen_netbk_map_frontend_rings(struct xenvif *vif,
 		netdev_warn(vif->dev,
 			    "failed to map rx ring. err=%d status=%d\n",
 			    err, op.status);
+		if (retry-- > 0)
+			goto retry_rx;
 		err = op.status;
 		goto err;
 	}
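
For anyone who wants to reason about the retry behaviour outside the
kernel, here is a minimal userspace sketch of the same bounded-retry
pattern. It is not netback code: fake_map_op, fake_map_grant_ref() and
map_with_retry() are made-up names, and the "hypercall" is just a stub
that fails a configurable number of times. With a retry count of 3 the
operation is attempted at most four times, which matches what the hack
above does for each ring.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the grant-map operation (not a Xen API). */
struct fake_map_op {
	int status;      /* 0 on success, negative on failure */
	int fail_count;  /* simulated transient failures still to come */
	int calls;       /* attempts made so far */
};

/* Stub "hypercall": fails until fail_count is used up, then succeeds. */
static void fake_map_grant_ref(struct fake_map_op *op)
{
	op->calls++;
	if (op->fail_count > 0) {
		op->fail_count--;
		op->status = -1;
	} else {
		op->status = 0;
	}
}

/*
 * Try once, then retry up to max_retries more times while the operation
 * keeps failing. Returns 0 on success or the last failing status.
 */
static int map_with_retry(struct fake_map_op *op, int max_retries)
{
	int retry = max_retries;

	assert(max_retries >= 0);
	for (;;) {
		fake_map_grant_ref(op);
		if (op->status == 0)
			return 0;
		fprintf(stderr, "map failed: status=%d retries left=%d\n",
			op->status, retry);
		if (retry-- <= 0)
			return op->status;
	}
}
```

If the failure clears within the retry budget, that points towards a
race; if every attempt fails the same way, the mapping is more likely
really missing.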
