All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] libxl: a domain can be dying but not shutdown
@ 2014-10-22 13:52 David Scott
  2014-10-22 14:02 ` Rob Hoes
  2014-10-22 14:08 ` Andrew Cooper
  0 siblings, 2 replies; 8+ messages in thread
From: David Scott @ 2014-10-22 13:52 UTC (permalink / raw)
  To: xen-devel
  Cc: David Scott, wei.liu2, ian.campbell, stefano.stabellini,
	ian.jackson, rob.hoes, euan.harris

The shutdown code is only present if the domain is shutdown.
If we attempt to extract it from the flags from a dying but not
shutdown domain then we get values like '255' which is not a
valid LIBXL_SHUTDOWN_REASON_. We should use LIBXL_SHUTDOWN_UNKNOWN
in this case.

Signed-off-by: David Scott <dave.scott@citrix.com>

---

This can be tested by running 2 domUs, and having one map pages from
the other. I used a vchan connection, so I had a vchan_server granting
pages and a vchan_client mapping them. I made sure the client is
never going to unmap the pages (I used 'sleep' in a Mirage kernel but
'xl pause' from outside would probably also work) and then I
'xl destroyed' the server. The server domain ends up stuck in the dying
state because the client still has a page mapped. The server domain
is not shutdown.

According to 'xl list':

djs@st20:~/djs55/list$ sudo xl list
Name                                        ID   Mem VCPUs      State   Time(s)
Domain-0                                     0  5278     6     r-----   10971.2
fedora                                      12  2048     1     -b----    5470.0
(null)                                      21     0     1     -bp--d      25.1
vchan_client                                22   256     1     -b----       0.0

and according to my test program which calls libxl_list_domain:

domain 0 shutdown = 0 dying = 0 shutdown_reason = -1
domain 12 shutdown = 0 dying = 0 shutdown_reason = -1
domain 21 shutdown = 0 dying = 1 shutdown_reason = 255
domain 22 shutdown = 0 dying = 0 shutdown_reason = -1

I believe this also manifests transiently during a normal 'xl destroy'.

Cheers,
Dave
---
 tools/libxl/libxl.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 9c72df2..a2a29b1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -600,7 +600,7 @@ static void xcinfo2xlinfo(libxl_ctx *ctx,
     xlinfo->blocked  = !!(xcinfo->flags&XEN_DOMINF_blocked);
     xlinfo->running  = !!(xcinfo->flags&XEN_DOMINF_running);
 
-    if (xlinfo->shutdown || xlinfo->dying)
+    if (xlinfo->shutdown)
         xlinfo->shutdown_reason = (xcinfo->flags>>XEN_DOMINF_shutdownshift) & XEN_DOMINF_shutdownmask;
     else
         xlinfo->shutdown_reason = LIBXL_SHUTDOWN_REASON_UNKNOWN;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-10-22 13:52 [PATCH] libxl: a domain can be dying but not shutdown David Scott
@ 2014-10-22 14:02 ` Rob Hoes
  2014-10-22 14:08 ` Andrew Cooper
  1 sibling, 0 replies; 8+ messages in thread
From: Rob Hoes @ 2014-10-22 14:02 UTC (permalink / raw)
  To: xen-devel
  Cc: Dave Scott, Wei Liu, Ian Campbell, Euan Harris,
	Stefano Stabellini, Ian Jackson

> The shutdown code is only present if the domain is shutdown.
> If we attempt to extract it from the flags from a dying but not shutdown
> domain then we get values like '255' which is not a valid
> LIBXL_SHUTDOWN_REASON_. We should use LIBXL_SHUTDOWN_UNKNOWN in this case.

Without this patch, basic XenServer tests failed intermittently due to the fact that the shutdown_reason had an invalid value (a value not in the enum defined in the IDL). The failure happened during a "hard shutdown", which translates to a  "destroy domain". With Dave's patch, the shutdown_reason became UNKNOWN in this case, and all was fine.

Acked-by: Rob Hoes <rob.hoes@citrix.com>

> Signed-off-by: David Scott <dave.scott@citrix.com>
> 
> ---
> 
> This can be tested by running 2 domUs, and having one map pages from the
> other. I used a vchan connection, so I had a vchan_server granting pages
> and a vchan_client mapping them. I made sure the client is never going to
> unmap the pages (I used 'sleep' in a Mirage kernel but 'xl pause' from
> outside would probably also work) and then I 'xl destroyed' the server.
> The server domain ends up stuck in the dying state because the client
> still has a page mapped. The server domain is not shutdown.
> 
> According to 'xl list':
> 
> djs@st20:~/djs55/list$ sudo xl list
> Name                                        ID   Mem VCPUs      State
> Time(s)
> Domain-0                                     0  5278     6     r-----
> 10971.2
> fedora                                      12  2048     1     -b----
> 5470.0
> (null)                                      21     0     1     -bp--d
> 25.1
> vchan_client                                22   256     1     -b----
> 0.0
> 
> and according to my test program which calls libxl_list_domain:
> 
> domain 0 shutdown = 0 dying = 0 shutdown_reason = -1 domain 12 shutdown =
> 0 dying = 0 shutdown_reason = -1 domain 21 shutdown = 0 dying = 1
> shutdown_reason = 255 domain 22 shutdown = 0 dying = 0 shutdown_reason = -
> 1
> 
> I believe this also manifests transiently during a normal 'xl destroy'.
> 
> Cheers,
> Dave
> ---
>  tools/libxl/libxl.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index
> 9c72df2..a2a29b1 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -600,7 +600,7 @@ static void xcinfo2xlinfo(libxl_ctx *ctx,
>      xlinfo->blocked  = !!(xcinfo->flags&XEN_DOMINF_blocked);
>      xlinfo->running  = !!(xcinfo->flags&XEN_DOMINF_running);
> 
> -    if (xlinfo->shutdown || xlinfo->dying)
> +    if (xlinfo->shutdown)
>          xlinfo->shutdown_reason = (xcinfo-
> >flags>>XEN_DOMINF_shutdownshift) & XEN_DOMINF_shutdownmask;
>      else
>          xlinfo->shutdown_reason = LIBXL_SHUTDOWN_REASON_UNKNOWN;
> --
> 1.7.10.4

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-10-22 13:52 [PATCH] libxl: a domain can be dying but not shutdown David Scott
  2014-10-22 14:02 ` Rob Hoes
@ 2014-10-22 14:08 ` Andrew Cooper
  2014-10-23  9:42   ` Ian Campbell
  1 sibling, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2014-10-22 14:08 UTC (permalink / raw)
  To: David Scott, xen-devel
  Cc: wei.liu2, ian.campbell, stefano.stabellini, ian.jackson,
	rob.hoes, euan.harris

On 22/10/14 14:52, David Scott wrote:
> The shutdown code is only present if the domain is shutdown.
> If we attempt to extract it from the flags from a dying but not
> shutdown domain then we get values like '255' which is not a
> valid LIBXL_SHUTDOWN_REASON_. We should use LIBXL_SHUTDOWN_UNKNOWN
> in this case.
>
> Signed-off-by: David Scott <dave.scott@citrix.com>
>
> ---
>
> This can be tested by running 2 domUs, and having one map pages from
> the other. I used a vchan connection, so I had a vchan_server granting
> pages and a vchan_client mapping them. I made sure the client is
> never going to unmap the pages (I used 'sleep' in a Mirage kernel but
> 'xl pause' from outside would probably also work) and then I
> 'xl destroyed' the server. The server domain ends up stuck in the dying
> state because the client still has a page mapped. The server domain
> is not shutdown.
>
> According to 'xl list':
>
> djs@st20:~/djs55/list$ sudo xl list
> Name                                        ID   Mem VCPUs      State   Time(s)
> Domain-0                                     0  5278     6     r-----   10971.2
> fedora                                      12  2048     1     -b----    5470.0
> (null)                                      21     0     1     -bp--d      25.1
> vchan_client                                22   256     1     -b----       0.0
>
> and according to my test program which calls libxl_list_domain:
>
> domain 0 shutdown = 0 dying = 0 shutdown_reason = -1
> domain 12 shutdown = 0 dying = 0 shutdown_reason = -1
> domain 21 shutdown = 0 dying = 1 shutdown_reason = 255
> domain 22 shutdown = 0 dying = 0 shutdown_reason = -1
>
> I believe this also manifests transiently during a normal 'xl destroy'.
>
> Cheers,
> Dave
> ---

Hmm.  Xen unconditionally sets the shutdown code in info->flags when
querying for dominfo.

d->shutdown_code defaults to -1 (which explains the 255, given the
mask), but only becomes valid once d->is_shutting_down gets set.

This equates to XEN_DOMINF_shutdown lower in the flags field, which is
translated to xlinfo->shutdown just ahead of the context below.

It is explicitly not valid for a dying domain, as a dying domain can be
dying for many reasons, few of which include a valid shutdown code.

Therefore, other bits of libxl.h could do with correcting, given the
buggy changeset 4d70c9c5

~Andrew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-10-22 14:08 ` Andrew Cooper
@ 2014-10-23  9:42   ` Ian Campbell
  2014-10-23 10:00     ` Andrew Cooper
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-10-23  9:42 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: David Scott, wei.liu2, stefano.stabellini, ian.jackson, rob.hoes,
	euan.harris, xen-devel

On Wed, 2014-10-22 at 15:08 +0100, Andrew Cooper wrote:
> On 22/10/14 14:52, David Scott wrote:
> > The shutdown code is only present if the domain is shutdown.
> > If we attempt to extract it from the flags from a dying but not
> > shutdown domain then we get values like '255' which is not a
> > valid LIBXL_SHUTDOWN_REASON_. We should use LIBXL_SHUTDOWN_UNKNOWN
> > in this case.
> >
> > Signed-off-by: David Scott <dave.scott@citrix.com>
> >
> > ---
> >
> > This can be tested by running 2 domUs, and having one map pages from
> > the other. I used a vchan connection, so I had a vchan_server granting
> > pages and a vchan_client mapping them. I made sure the client is
> > never going to unmap the pages (I used 'sleep' in a Mirage kernel but
> > 'xl pause' from outside would probably also work) and then I
> > 'xl destroyed' the server. The server domain ends up stuck in the dying
> > state because the client still has a page mapped. The server domain
> > is not shutdown.
> >
> > According to 'xl list':
> >
> > djs@st20:~/djs55/list$ sudo xl list
> > Name                                        ID   Mem VCPUs      State   Time(s)
> > Domain-0                                     0  5278     6     r-----   10971.2
> > fedora                                      12  2048     1     -b----    5470.0
> > (null)                                      21     0     1     -bp--d      25.1
> > vchan_client                                22   256     1     -b----       0.0
> >
> > and according to my test program which calls libxl_list_domain:
> >
> > domain 0 shutdown = 0 dying = 0 shutdown_reason = -1
> > domain 12 shutdown = 0 dying = 0 shutdown_reason = -1
> > domain 21 shutdown = 0 dying = 1 shutdown_reason = 255
> > domain 22 shutdown = 0 dying = 0 shutdown_reason = -1
> >
> > I believe this also manifests transiently during a normal 'xl destroy'.
> >
> > Cheers,
> > Dave
> > ---
> 
> Hmm.  Xen unconditionally sets the shutdown code in info->flags when
> querying for dominfo.
> 
> d->shutdown_code defaults to -1 (which explains the 255, given the
> mask), but only becomes valid once d->is_shutting_down gets set.
> 
> This equates to XEN_DOMINF_shutdown lower in the flags field, which is
> translated to xlinfo->shutdown just ahead of the context below.
> 
> It is explicitly not valid for a dying domain, as a dying domain can be
> dying for many reasons, few of which include a valid shutdown code.

I'm afraid I can't tell whether this constitutes an implicit ack or a
nack of this patch.

> Therefore, other bits of libxl.h could do with correcting, given the
> buggy changeset 4d70c9c5

Do you know of any specific locations? I don't see any relevant uses of
the word "shutdown" or "reason" in libxl.h.

Did you instead mean this from libxl_types.idl:
    # Valid iff (shutdown||dying).
    #
    # Otherwise set to a value guaranteed not to clash with any valid
    # LIBXL_SHUTDOWN_REASON_* constant.
    ("shutdown_reason", libxl_shutdown_reason),
?

Ian.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-10-23  9:42   ` Ian Campbell
@ 2014-10-23 10:00     ` Andrew Cooper
  2014-11-04 10:48       ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Cooper @ 2014-10-23 10:00 UTC (permalink / raw)
  To: Ian Campbell
  Cc: David Scott, wei.liu2, stefano.stabellini, ian.jackson, rob.hoes,
	euan.harris, xen-devel

On 23/10/14 10:42, Ian Campbell wrote:
> On Wed, 2014-10-22 at 15:08 +0100, Andrew Cooper wrote:
>> On 22/10/14 14:52, David Scott wrote:
>>> The shutdown code is only present if the domain is shutdown.
>>> If we attempt to extract it from the flags from a dying but not
>>> shutdown domain then we get values like '255' which is not a
>>> valid LIBXL_SHUTDOWN_REASON_. We should use LIBXL_SHUTDOWN_UNKNOWN
>>> in this case.
>>>
>>> Signed-off-by: David Scott <dave.scott@citrix.com>
>>>
>>> ---
>>>
>>> This can be tested by running 2 domUs, and having one map pages from
>>> the other. I used a vchan connection, so I had a vchan_server granting
>>> pages and a vchan_client mapping them. I made sure the client is
>>> never going to unmap the pages (I used 'sleep' in a Mirage kernel but
>>> 'xl pause' from outside would probably also work) and then I
>>> 'xl destroyed' the server. The server domain ends up stuck in the dying
>>> state because the client still has a page mapped. The server domain
>>> is not shutdown.
>>>
>>> According to 'xl list':
>>>
>>> djs@st20:~/djs55/list$ sudo xl list
>>> Name                                        ID   Mem VCPUs      State   Time(s)
>>> Domain-0                                     0  5278     6     r-----   10971.2
>>> fedora                                      12  2048     1     -b----    5470.0
>>> (null)                                      21     0     1     -bp--d      25.1
>>> vchan_client                                22   256     1     -b----       0.0
>>>
>>> and according to my test program which calls libxl_list_domain:
>>>
>>> domain 0 shutdown = 0 dying = 0 shutdown_reason = -1
>>> domain 12 shutdown = 0 dying = 0 shutdown_reason = -1
>>> domain 21 shutdown = 0 dying = 1 shutdown_reason = 255
>>> domain 22 shutdown = 0 dying = 0 shutdown_reason = -1
>>>
>>> I believe this also manifests transiently during a normal 'xl destroy'.
>>>
>>> Cheers,
>>> Dave
>>> ---
>> Hmm.  Xen unconditionally sets the shutdown code in info->flags when
>> querying for dominfo.
>>
>> d->shutdown_code defaults to -1 (which explains the 255, given the
>> mask), but only becomes valid once d->is_shutting_down gets set.
>>
>> This equates to XEN_DOMINF_shutdown lower in the flags field, which is
>> translated to xlinfo->shutdown just ahead of the context below.
>>
>> It is explicitly not valid for a dying domain, as a dying domain can be
>> dying for many reasons, few of which include a valid shutdown code.
> I'm afraid I can't tell whether this constitutes an implicit ack or a
> nack of this patch.

Sorry.  The patch is good, but also wants to adjust some comments to
match reality.

>
>> Therefore, other bits of libxl.h could do with correcting, given the
>> buggy changeset 4d70c9c5
> Do you know of any specific locations? I don't see any relevant uses of
> the word "shutdown" or "reason" in libxl.h.
>
> Did you instead mean this from libxl_types.idl:
>     # Valid iff (shutdown||dying).
>     #
>     # Otherwise set to a value guaranteed not to clash with any valid
>     # LIBXL_SHUTDOWN_REASON_* constant.
>     ("shutdown_reason", libxl_shutdown_reason),
> ?

That is the primary comment I was on about.

~Andrew

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-10-23 10:00     ` Andrew Cooper
@ 2014-11-04 10:48       ` Ian Campbell
  2014-11-04 11:34         ` Dave Scott
  0 siblings, 1 reply; 8+ messages in thread
From: Ian Campbell @ 2014-11-04 10:48 UTC (permalink / raw)
  To: Andrew Cooper, Konrad Rzeszutek Wilk
  Cc: David Scott, wei.liu2, stefano.stabellini, ian.jackson, rob.hoes,
	euan.harris, xen-devel

On Thu, 2014-10-23 at 11:00 +0100, Andrew Cooper wrote:

> > Did you instead mean this from libxl_types.idl:
> >     # Valid iff (shutdown||dying).
> >     #
> >     # Otherwise set to a value guaranteed not to clash with any valid
> >     # LIBXL_SHUTDOWN_REASON_* constant.
> >     ("shutdown_reason", libxl_shutdown_reason),
> > ?
> 
> That is the primary comment I was on about.

This wasn't tagged for 4.5 but it is a bug fix and the commit log &
associated commentary were pretty convincing, so Acked + Applied with
the additional hunk below folded in.

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index ca3f724..f7fc695 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -260,7 +260,7 @@ libxl_dominfo = Struct("dominfo",[
     ("shutdown",    bool),
     ("dying",       bool),
 
-    # Valid iff (shutdown||dying).
+    # Valid iff ->shutdown is true.
     #
     # Otherwise set to a value guaranteed not to clash with any valid
     # LIBXL_SHUTDOWN_REASON_* constant.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-11-04 10:48       ` Ian Campbell
@ 2014-11-04 11:34         ` Dave Scott
  2014-11-04 11:35           ` Ian Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Dave Scott @ 2014-11-04 11:34 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Wei Liu, Euan Harris, Andrew Cooper, Rob Hoes,
	Stefano Stabellini, xen-devel, Ian Jackson


> On 4 Nov 2014, at 10:48, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> 
> On Thu, 2014-10-23 at 11:00 +0100, Andrew Cooper wrote:
> 
>>> Did you instead mean this from libxl_types.idl:
>>>    # Valid iff (shutdown||dying).
>>>    #
>>>    # Otherwise set to a value guaranteed not to clash with any valid
>>>    # LIBXL_SHUTDOWN_REASON_* constant.
>>>    ("shutdown_reason", libxl_shutdown_reason),
>>> ?
>> 
>> That is the primary comment I was on about.
> 
> This wasn't tagged for 4.5 but it is a bug fix and the commit log &
> associated commentary were pretty convincing, so Acked + Applied with
> the additional hunk below folded in.
> 
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index ca3f724..f7fc695 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -260,7 +260,7 @@ libxl_dominfo = Struct("dominfo",[
>     ("shutdown",    bool),
>     ("dying",       bool),
> 
> -    # Valid iff (shutdown||dying).
> +    # Valid iff ->shutdown is true.

Great, thanks for doing this (after I forgot!)

Cheers,
Dave

>     #
>     # Otherwise set to a value guaranteed not to clash with any valid
>     # LIBXL_SHUTDOWN_REASON_* constant.
> 
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] libxl: a domain can be dying but not shutdown
  2014-11-04 11:34         ` Dave Scott
@ 2014-11-04 11:35           ` Ian Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2014-11-04 11:35 UTC (permalink / raw)
  To: Dave Scott
  Cc: Wei Liu, Euan Harris, Andrew Cooper, Rob Hoes,
	Stefano Stabellini, xen-devel, Ian Jackson

On Tue, 2014-11-04 at 11:34 +0000, Dave Scott wrote:
> > On 4 Nov 2014, at 10:48, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> > 
> > On Thu, 2014-10-23 at 11:00 +0100, Andrew Cooper wrote:
> > 
> >>> Did you instead mean this from libxl_types.idl:
> >>>    # Valid iff (shutdown||dying).
> >>>    #
> >>>    # Otherwise set to a value guaranteed not to clash with any valid
> >>>    # LIBXL_SHUTDOWN_REASON_* constant.
> >>>    ("shutdown_reason", libxl_shutdown_reason),
> >>> ?
> >> 
> >> That is the primary comment I was on about.
> > 
> > This wasn't tagged for 4.5 but it is a bug fix and the commit log &
> > associated commentary were pretty convincing, so Acked + Applied with
> > the additional hunk below folded in.
> > 
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index ca3f724..f7fc695 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -260,7 +260,7 @@ libxl_dominfo = Struct("dominfo",[
> >     ("shutdown",    bool),
> >     ("dying",       bool),
> > 
> > -    # Valid iff (shutdown||dying).
> > +    # Valid iff ->shutdown is true.
> 
> Great, thanks for doing this (after I forgot!)

It was easier than pinging ;-)

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-11-04 11:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-10-22 13:52 [PATCH] libxl: a domain can be dying but not shutdown David Scott
2014-10-22 14:02 ` Rob Hoes
2014-10-22 14:08 ` Andrew Cooper
2014-10-23  9:42   ` Ian Campbell
2014-10-23 10:00     ` Andrew Cooper
2014-11-04 10:48       ` Ian Campbell
2014-11-04 11:34         ` Dave Scott
2014-11-04 11:35           ` Ian Campbell

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.