* New release candidates for Xen 3.3.2 and 3.4.1
@ 2009-06-23 10:44 Keir Fraser
  2009-06-23 16:14 ` Frank van der Linden
  0 siblings, 1 reply; 5+ messages in thread
From: Keir Fraser @ 2009-06-23 10:44 UTC (permalink / raw)
  To: xen-devel

Folks,

New release candidates are tagged:
http://xenbits.xensource.com/xen-3.3-testing.hg tagged 3.3.2-rc3
http://xenbits.xensource.com/xen-3.4-testing.hg tagged 3.4.1-rc4

Please test! I hope to release later this week.

 -- Keir


* Re: New release candidates for Xen 3.3.2 and 3.4.1
  2009-06-23 10:44 New release candidates for Xen 3.3.2 and 3.4.1 Keir Fraser
@ 2009-06-23 16:14 ` Frank van der Linden
  2009-06-23 17:24   ` Keir Fraser
  0 siblings, 1 reply; 5+ messages in thread
From: Frank van der Linden @ 2009-06-23 16:14 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2008 bytes --]

Keir Fraser wrote:
> Folks,
>
> New release candidates are tagged:
> http://xenbits.xensource.com/xen-3.3-testing.hg tagged 3.3.2-rc3
> http://xenbits.xensource.com/xen-3.4-testing.hg tagged 3.4.1-rc4
>
> Please test! I hope to release later this week.
>
>  -- Keir
>   
We are putting our 3.3.2-rc3 based tree through some testing, and are 
seeing what looks like a shadow page table issue.

This issue happens with Solaris (both Solaris 10 and more recent 
OpenSolaris-based builds), running in a HVM domain. The domain has 3 or 
more VCPUs (4 is the usual number). At some point, the Solaris kernel 
will panic. The pattern always looks the same: the Solaris kernel 
allocates some kmem, and then touches it shortly afterwards (e.g. to 
zero it out, write a 0xbaddcafe debug pattern to it, etc). When it 
touches the memory, it gets a fatal pagefault (page not present).

However, when inspecting the state of the guest page tables, they all 
look fine. The page is mapped, as far as the guest is concerned. That 
means that the shadow page table code must have gotten it wrong. An 
additional data point that points in this direction is that on a system 
that very reliably reproduces the problem, setting hap=1 makes the 
problem go away (the problem is reproduced by doing a virt-install, 
which doesn't set hap to 1 by default).
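
The reasoning above can be sketched in C. This is only an illustration, 
not code from Xen or Solaris: classify_fault() and the enum are 
hypothetical names, and the sketch looks at a single leaf entry, whereas 
a real check has to walk every level of the guest page tables. The two 
bit definitions are the architectural x86 ones (bit 0 of a page-table 
entry is "present"; bit 0 of the page-fault error code is clear when the 
fault was caused by a not-present page).

```c
#include <stdint.h>

#define PTE_PRESENT   0x1ull  /* x86 page-table entry, bit 0: present */
#define PFEC_PRESENT  0x1u    /* x86 #PF error code, bit 0: clear = not-present fault */

/* Hypothetical classification, for illustration only. */
enum fault_origin {
    FAULT_PROTECTION,        /* page was present; some other violation */
    FAULT_GUEST_NOT_PRESENT, /* guest really has no mapping */
    FAULT_SUSPECT_SHADOW     /* guest maps the page, yet hardware saw none */
};

/* Compare what the guest's own page-table entry says with what the
 * page-fault error code reports. */
enum fault_origin classify_fault(uint64_t guest_pte, uint32_t error_code)
{
    if (error_code & PFEC_PRESENT)
        return FAULT_PROTECTION;
    if (!(guest_pte & PTE_PRESENT))
        return FAULT_GUEST_NOT_PRESENT;
    /* Guest entry is present but the fault says "not present": with
     * shadow paging the hardware walks the shadow tables, so the
     * shadow copy is the likely culprit. */
    return FAULT_SUSPECT_SHADOW;
}
```

The panics described here correspond to the last case: a not-present 
fault on an address whose guest entry has the present bit set.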

I'm trying to narrow the circumstances down to get some useful data out 
of this; I'll try disabling the out of sync optimizations in the shadow 
code, etc. I've tried to add more instrumentation to the shadow code, 
but this often changes the timing just enough to avoid the bug.

Our tree has no changes to the shadow page table code. We haven't yet 
tried to reproduce this on our 3.4-based tree (we need a better way to 
reproduce it in a more controlled environment for that).

I've filed bug #1480 for this.

Oh, and speaking of the out of sync option, Xen doesn't compile if it's 
disabled, because some #if conditions use && instead of &. Patch attached.
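
To show why the operator matters, here is a minimal sketch with made-up 
flag names and values (the DEMO_* macros are assumptions for 
illustration, not the real Xen definitions). When a set of options is 
stored as a bitmask, && only asks "are both values non-zero?", so the 
guarded code is still compiled in after the option's bit has been 
removed from the mask. The bitwise & tests the specific bit.

```c
/* Hypothetical option bits; the real values in the Xen shadow code
 * may differ. */
#define DEMO_OPT_FAST_PATH    0x01
#define DEMO_OPT_OUT_OF_SYNC  0x02

/* Build configuration with the out-of-sync option left out. */
#define DEMO_OPTIMIZATIONS    (DEMO_OPT_FAST_PATH)

/* Logical AND: true whenever both macros are non-zero, so the option
 * looks enabled even though its bit is not in the mask. */
int logical_and_enabled(void)
{
#if (DEMO_OPTIMIZATIONS && DEMO_OPT_OUT_OF_SYNC)
    return 1;
#else
    return 0;
#endif
}

/* Bitwise AND: true only if the option's bit is set in the mask. */
int bitwise_and_enabled(void)
{
#if (DEMO_OPTIMIZATIONS & DEMO_OPT_OUT_OF_SYNC)
    return 1;
#else
    return 0;
#endif
}
```

With this configuration logical_and_enabled() returns 1 and 
bitwise_and_enabled() returns 0. A block guarded by the && form would 
therefore still be compiled, and presumably refer to things that only 
exist when the option is really on.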

- Frank



[-- Attachment #2: shadow-no-sync --]
[-- Type: text/plain, Size: 1487 bytes --]

diff -r ed718c13f651 xen/arch/x86/mm/shadow/multi.c
--- a/xen/arch/x86/mm/shadow/multi.c	Sun Jun 21 20:19:07 2009 -0700
+++ b/xen/arch/x86/mm/shadow/multi.c	Tue Jun 23 09:09:11 2009 -0700
@@ -2048,7 +2048,7 @@
         if ( r & SHADOW_SET_ERROR )
             return NULL;
 
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
         *resync |= 1;
 #endif
 
@@ -2103,7 +2103,7 @@
         if ( r & SHADOW_SET_ERROR )
             return NULL;        
 
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
         *resync |= 1;
 #endif
 
@@ -2200,7 +2200,7 @@
         (void) shadow_l1_index(sl1mfn, guest_l1_table_offset(gw->va));
     }
 
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
     /* All pages walked are now pagetables. Safe to resync pages
        in case level 4 or 3 shadows were set. */
     if ( resync )
@@ -2482,7 +2482,7 @@
         else
             result |= SHADOW_SET_ERROR;
 
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
         if ( mfn_valid(sl3mfn) )
             shadow_resync_all(v, 0);
 #endif
@@ -2539,7 +2539,7 @@
         else
             result |= SHADOW_SET_ERROR;
 
-#if (SHADOW_OPTIMIZATIONS && SHOPT_OUT_OF_SYNC )
+#if (SHADOW_OPTIMIZATIONS & SHOPT_OUT_OF_SYNC )
         if ( mfn_valid(sl2mfn) )
             shadow_resync_all(v, 0);
 #endif

[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* Re: New release candidates for Xen 3.3.2 and 3.4.1
  2009-06-23 16:14 ` Frank van der Linden
@ 2009-06-23 17:24   ` Keir Fraser
  2009-06-23 17:43     ` Frank van der Linden
  0 siblings, 1 reply; 5+ messages in thread
From: Keir Fraser @ 2009-06-23 17:24 UTC (permalink / raw)
  To: Frank van der Linden; +Cc: xen-devel

On 23/06/2009 17:14, "Frank van der Linden" <Frank.Vanderlinden@Sun.COM>
wrote:

> We are putting our 3.3.2-rc3 based tree through some testing, and are
> seeing what looks like a shadow page table issue.

Is this a regression relative to 3.3.1, do you know? That would make it more
likely a blocker.

 -- Keir


* Re: New release candidates for Xen 3.3.2 and 3.4.1
  2009-06-23 17:24   ` Keir Fraser
@ 2009-06-23 17:43     ` Frank van der Linden
  2009-06-23 19:03       ` Keir Fraser
  0 siblings, 1 reply; 5+ messages in thread
From: Frank van der Linden @ 2009-06-23 17:43 UTC (permalink / raw)
  To: Keir Fraser; +Cc: xen-devel

Keir Fraser wrote:
> On 23/06/2009 17:14, "Frank van der Linden" <Frank.Vanderlinden@Sun.COM>
> wrote:
>
>   
>> We are putting our 3.3.2-rc3 based tree through some testing, and are
>> seeing what looks like a shadow page table issue.
>>     
>
> Is this a regression relative to 3.3.1, do you know? That would make it more
> likely a blocker.
>
>  -- Keir
>
>   
I can't say for sure, unfortunately, since the tests that uncovered this 
weren't run on earlier incarnations of our 3.3-based tree.

It seems that a few shadow fixes have gone in post 3.3.1, but I have no 
evidence that these would cause this particular problem.

- Frank


* Re: New release candidates for Xen 3.3.2 and 3.4.1
  2009-06-23 17:43     ` Frank van der Linden
@ 2009-06-23 19:03       ` Keir Fraser
  0 siblings, 0 replies; 5+ messages in thread
From: Keir Fraser @ 2009-06-23 19:03 UTC (permalink / raw)
  To: Frank van der Linden; +Cc: xen-devel

On 23/06/2009 18:43, "Frank van der Linden" <Frank.Vanderlinden@Sun.COM>
wrote:

>> Is this a regression relative to 3.3.1, do you know? That would make it more
>> likely a blocker.
>> 
>>  -- Keir
>> 
> I can't say for sure, unfortunately, since the tests that uncovered this
> weren't run on earlier incarnations of our 3.3-based tree.
> 
> It seems that a few shadow fixes have gone in post 3.3.1, but I have no
> evidence that these would cause this particular problem.

It might be sensible to try, depending on how long the bug takes to
reproduce. If you can be sure that it does not repro with 3.3.1 that has two
advantages for you: firstly, I would hold up the 3.3.2 release, and secondly
the set of changesets to search for the bug is greatly reduced.

 -- Keir
