* Shadow domains left zombie
@ 2012-04-13 16:19 Andres Lagar-Cavilla
  2012-04-13 17:35 ` Gianluca Guida
  2012-04-19 17:08 ` [PATCH] " Tim Deegan
  0 siblings, 2 replies; 8+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-13 16:19 UTC (permalink / raw)
  To: xen-devel; +Cc: Gianluca Guida, tim

After a hvm+shadow domain dies (either clean shutdown or merciless
destroy), the domain is left in a zombie state with 1 (one) page left
dangling with a single reference.

(XEN) General information for domain 1:
(XEN)     refcnt=1 dying=2 pause_count=1
(XEN)     nr_pages=1 xenheap_pages=0 shared_pages=0 paged_pages=0 dirty_cpus={} max_pages=524544
(XEN)     handle=deadbeef-dead-beef-dead-beef00000001 vm_assist=00000000
(XEN)     paging assistance: shadow refcounts translate external
(XEN) Rangesets belonging to domain 1:
(XEN)     I/O Ports  { }
(XEN)     Interrupts { }
(XEN)     I/O Memory { }
(XEN) Memory pages belonging to domain 1:
(XEN)     DomPage 000000000010698e: caf=00000001, taf=7400000000000000
(XEN)     PoD entries=0 cachesize=0
(XEN) VCPU information and callbacks for domain 1:
(XEN)     VCPU0: CPU0 [has=F] poll=0 upcall_pend = 00, upcall_mask = 00 dirty_cpus={} cpu_affinity={0-3}
(XEN)     pause_count=1 pause_flags=0
(XEN)     paging assistance: shadowed 4-on-4
(XEN)     No periodic timer

If I add a considerable number of synchronous printk's, the domain is
sometimes not left as a zombie, so there seems to be a race here. Judging by
the type information of the page, I believe this is a page that has been
shadowed with a writable mapping.
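
As a rough illustration of reading that type information, the sketch below
decodes the taf value from the dump above. It assumes the x86-64 type_info
layout of Xen 4.x (type in the top four bits, with 7 meaning a writable
page, followed by a pinned bit and a validated bit). That layout is an
assumption here and should be checked against the tree in use; the TOY_
names and the struct are made up for this example.

```c
#include <assert.h>   /* for callers that want to check the decoded fields */
#include <stdbool.h>
#include <stdint.h>

/* ASSUMED layout (verify against asm-x86/mm.h in your tree):
 * bits 63..60 = page type, bit 59 = pinned, bit 58 = validated. */
#define TOY_PGT_TYPE_SHIFT     60
#define TOY_PGT_TYPE_MASK      0xfu
#define TOY_PGT_WRITABLE       7u   /* assumed value of the writable-page type */
#define TOY_PGT_PINNED_BIT     59
#define TOY_PGT_VALIDATED_BIT  58

/* Hypothetical container for the decoded fields. */
struct toy_taf {
    unsigned type;
    bool pinned;
    bool validated;
};

/* Split a 64-bit taf value into the fields named above. */
static struct toy_taf toy_decode_taf(uint64_t taf)
{
    struct toy_taf t;

    t.type = (unsigned)(taf >> TOY_PGT_TYPE_SHIFT) & TOY_PGT_TYPE_MASK;
    t.pinned = ((taf >> TOY_PGT_PINNED_BIT) & 1u) != 0;
    t.validated = ((taf >> TOY_PGT_VALIDATED_BIT) & 1u) != 0;
    return t;
}
```

Under those assumptions, toy_decode_taf(0x7400000000000000ULL) yields type 7
(writable), not pinned, validated.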

I verified the page is not any of the helper rings (qemu, buffered qemu,
store, console) that may get external writeable references.

This happens on win7 guest with or without pv drivers. It happens with or
without shadow optimizations (SHOPT defines). It happens with or without
synchronized p2m lookups (patches just posted).

Hopefully the shadow masters have a better understanding of how to proceed
from here.

Thanks,
Andres

* Re: Shadow domains left zombie
  2012-04-13 16:19 Shadow domains left zombie Andres Lagar-Cavilla
@ 2012-04-13 17:35 ` Gianluca Guida
  2012-04-13 17:38   ` Andres Lagar-Cavilla
  2012-04-19 17:08 ` [PATCH] " Tim Deegan
  1 sibling, 1 reply; 8+ messages in thread
From: Gianluca Guida @ 2012-04-13 17:35 UTC (permalink / raw)
  To: andres; +Cc: tim, xen-devel

On Fri, Apr 13, 2012 at 9:19 AM, Andres Lagar-Cavilla
<andres@lagarcavilla.org> wrote:
> After a hvm+shadow domain dies (either clean shutdown or merciless
> destroy), the domain is left in a zombie state with 1 (one) page left
> dangling with a single reference.

[...]

> (XEN)     paging assistance: shadowed 4-on-4

[...]

> This happens on win7 guest with or without pv drivers. It happens with or
> without shadow optimizations (SHOPT defines). It happens with or without
> synchronized p2m lookups (patches just posted).

Does it happen only in 64-bit guests?

Thanks,
Gianluca

* Re: Shadow domains left zombie
  2012-04-13 17:35 ` Gianluca Guida
@ 2012-04-13 17:38   ` Andres Lagar-Cavilla
  0 siblings, 0 replies; 8+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-13 17:38 UTC (permalink / raw)
  To: Gianluca Guida; +Cc: tim, xen-devel

> On Fri, Apr 13, 2012 at 9:19 AM, Andres Lagar-Cavilla
> <andres@lagarcavilla.org> wrote:
>> After a hvm+shadow domain dies (either clean shutdown or merciless
>> destroy), the domain is left in a zombie state with 1 (one) page left
>> dangling with a single reference.
>
> [...]
>
>> (XEN)     paging assistance: shadowed 4-on-4
>
> [...]
>
>> This happens on win7 guest with or without pv drivers. It happens with or
>> without shadow optimizations (SHOPT defines). It happens with or without
>> synchronized p2m lookups (patches just posted).
>
> Does it happen only in 64-bit guests?
I haven't tried 32-bit guests (with or without PAE).
Andres

>
> Thanks,
> Gianluca
>

* [PATCH] Re:  Shadow domains left zombie
  2012-04-13 16:19 Shadow domains left zombie Andres Lagar-Cavilla
  2012-04-13 17:35 ` Gianluca Guida
@ 2012-04-19 17:08 ` Tim Deegan
  2012-04-19 20:10   ` Andres Lagar-Cavilla
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Tim Deegan @ 2012-04-19 17:08 UTC (permalink / raw)
  To: Andres Lagar-Cavilla; +Cc: Gianluca Guida, Jan Beulich, xen-devel

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

At 09:19 -0700 on 13 Apr (1334308772), Andres Lagar-Cavilla wrote:
> After a hvm+shadow domain dies (either clean shutdown or merciless
> destroy), the domain is left in a zombie state with 1 (one) page left
> dangling with a single reference.

The reference is to the top-level pagetable that was pointed to by CR3
when the domain was killed.  This bug came in with:

 changeset:   23142:f5e8d152a565
 user:        Jan Beulich <jbeulich@novell.com>
 date:        Tue Apr 05 13:01:25 2011 +0100
 description: x86: split struct vcpu

where HVM domains no longer have vcpu_destroy_pagetables(v) called on
their VCPUs as they die.  Proposed fix attached.

Cheers,

Tim.

[-- Attachment #2: guest-table-ref --]
[-- Type: text/plain, Size: 1083 bytes --]

x86: restore vcpu_destroy_pagetables() call on HVM domain teardown.

HVM vcpus that are using shadow pagetables have valid guest_table fields,
which need to be tidied up on domain teardown.

Signed-off-by: Tim Deegan <tim@xen.org>

diff -r 29e4f8cefc5a -r e67b344afe8e xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Thu Apr 19 15:48:30 2012 +0100
+++ b/xen/arch/x86/domain.c	Thu Apr 19 18:04:29 2012 +0100
@@ -2105,13 +2105,14 @@ int domain_relinquish_resources(struct d
         /* Tear down paging-assistance stuff. */
         paging_teardown(d);
 
+        /* Drop the in-use references to page-table bases. */
+        for_each_vcpu ( d, v )
+            vcpu_destroy_pagetables(v);
+
         if ( !is_hvm_domain(d) )
         {
             for_each_vcpu ( d, v )
             {
-                /* Drop the in-use references to page-table bases. */
-                vcpu_destroy_pagetables(v);
-
                 /*
                  * Relinquish GDT mappings. No need for explicit unmapping of
                  * the LDT as it automatically gets squashed with the guest


* Re: [PATCH] Re:  Shadow domains left zombie
  2012-04-19 17:08 ` [PATCH] " Tim Deegan
@ 2012-04-19 20:10   ` Andres Lagar-Cavilla
  2012-04-20  7:59   ` Jan Beulich
  2012-04-20  9:26   ` Ian Campbell
  2 siblings, 0 replies; 8+ messages in thread
From: Andres Lagar-Cavilla @ 2012-04-19 20:10 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Gianluca Guida, Jan Beulich, xen-devel

> At 09:19 -0700 on 13 Apr (1334308772), Andres Lagar-Cavilla wrote:
>> After a hvm+shadow domain dies (either clean shutdown or merciless
>> destroy), the domain is left in a zombie state with 1 (one) page left
>> dangling with a single reference.
>
> The reference is to the top-level pagetable that was pointed to by CR3
> when the domain was killed.  This bug came in with:
>
>  changeset:   23142:f5e8d152a565
>  user:        Jan Beulich <jbeulich@novell.com>
>  date:        Tue Apr 05 13:01:25 2011 +0100
>  description: x86: split struct vcpu
>
> where HVM domains no longer have vcpu_destroy_pagetables(v) called on
> their VCPUs as they die.  Proposed fix attached.
>
> Cheers,

Looks good. Thanks for tracing that down. Ack from my end.
Thanks
Andres

>
> Tim.
>

* [PATCH] Re:  Shadow domains left zombie
  2012-04-19 17:08 ` [PATCH] " Tim Deegan
  2012-04-19 20:10   ` Andres Lagar-Cavilla
@ 2012-04-20  7:59   ` Jan Beulich
  2012-04-20  8:51     ` Tim Deegan
  2012-04-20  9:26   ` Ian Campbell
  2 siblings, 1 reply; 8+ messages in thread
From: Jan Beulich @ 2012-04-20  7:59 UTC (permalink / raw)
  To: Andres Lagar-Cavilla, Tim Deegan; +Cc: Gianluca Guida, xen-devel

>>> On 19.04.12 at 19:08, Tim Deegan <tim@xen.org> wrote:
> At 09:19 -0700 on 13 Apr (1334308772), Andres Lagar-Cavilla wrote:
>> After a hvm+shadow domain dies (either clean shutdown or merciless
>> destroy), the domain is left in a zombie state with 1 (one) page left
>> dangling with a single reference.
> 
> The reference is to the top-level pagetable that was pointed to by CR3
> when the domain was killed.  This bug came in with:
> 
>  changeset:   23142:f5e8d152a565
>  user:        Jan Beulich <jbeulich@novell.com>
>  date:        Tue Apr 05 13:01:25 2011 +0100
>  description: x86: split struct vcpu
> 
> where HVM domains no longer have vcpu_destroy_pagetables(v) called on
> their VCPUs as they die.  Proposed fix attached.

Acked-by: Jan Beulich <jbeulich@suse.com>

I'm sorry for that. Given that this was quite some time ago, I can only
guess that I was misled by the fact that arch_vcpu_reset() calls this
for PV only (legitimately, i.e. not causing any leak).

Jan

* Re: [PATCH] Re:  Shadow domains left zombie
  2012-04-20  7:59   ` Jan Beulich
@ 2012-04-20  8:51     ` Tim Deegan
  0 siblings, 0 replies; 8+ messages in thread
From: Tim Deegan @ 2012-04-20  8:51 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Gianluca Guida, Andres Lagar-Cavilla, xen-devel

At 08:59 +0100 on 20 Apr (1334912349), Jan Beulich wrote:
> >>> On 19.04.12 at 19:08, Tim Deegan <tim@xen.org> wrote:
> > At 09:19 -0700 on 13 Apr (1334308772), Andres Lagar-Cavilla wrote:
> >> After a hvm+shadow domain dies (either clean shutdown or merciless
> >> destroy), the domain is left in a zombie state with 1 (one) page left
> >> dangling with a single reference.
> > 
> > The reference is to the top-level pagetable that was pointed to by CR3
> > when the domain was killed.  This bug came in with:
> > 
> >  changeset:   23142:f5e8d152a565
> >  user:        Jan Beulich <jbeulich@novell.com>
> >  date:        Tue Apr 05 13:01:25 2011 +0100
> >  description: x86: split struct vcpu
> > 
> > where HVM domains no longer have vcpu_destroy_pagetables(v) called on
> > their VCPUs as they die.  Proposed fix attached.
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> I'm sorry for that. Given that this was quite some time ago, I can only
> guess that I was misled by the fact that arch_vcpu_reset() calls this
> for PV only (legitimately, i.e. not causing any leak).

Yeah, it's not exactly clear.  Maybe after 4.2 I'll look at making it
more uniform.

Cheers,

Tim.

* Re: [PATCH] Re:  Shadow domains left zombie
  2012-04-19 17:08 ` [PATCH] " Tim Deegan
  2012-04-19 20:10   ` Andres Lagar-Cavilla
  2012-04-20  7:59   ` Jan Beulich
@ 2012-04-20  9:26   ` Ian Campbell
  2 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2012-04-20  9:26 UTC (permalink / raw)
  To: Tim Deegan; +Cc: Gianluca Guida, Andres Lagar-Cavilla, Jan Beulich, xen-devel

On Thu, 2012-04-19 at 18:08 +0100, Tim Deegan wrote:
> At 09:19 -0700 on 13 Apr (1334308772), Andres Lagar-Cavilla wrote:
> > After a hvm+shadow domain dies (either clean shutdown or merciless
> > destroy), the domain is left in a zombie state with 1 (one) page left
> > dangling with a single reference.
> 
> The reference is to the top-level pagetable that was pointed to by CR3
> when the domain was killed.  This bug came in with:
> 
>  changeset:   23142:f5e8d152a565
>  user:        Jan Beulich <jbeulich@novell.com>
>  date:        Tue Apr 05 13:01:25 2011 +0100
>  description: x86: split struct vcpu
> 
> where HVM domains no longer have vcpu_destroy_pagetables(v) called on
> their VCPUs as they die.  Proposed fix attached.

FTR this fixes the zombie domain issue I'd been seeing, AFAICT.

Thanks!

Ian.
