All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Revert xc cpupool retries and document anomaly
@ 2016-04-14 17:07 Ian Jackson
  2016-04-14 17:07 ` [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case" Ian Jackson
                   ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Ian Jackson @ 2016-04-14 17:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Dario Faggioli,
	Tim Deegan, Jan Beulich

This small series is part of some cleanup for the CPUPOOL RMCPU EBUSY
problem.

The first patch in this series, a libxc patch, reverts what I think is
simply a mistake.  The second is trivial adds some annotations for the
benefit of the hypercall API HTML docs generator.  The third patch
provides an attempt at documenting the RMCPU EBUSY problem.


It is disappointing that such a strange and hard-to-use interface
should be committed to the hypervisor.  A contributing factor seem to
be a failure to realise that the behaviour should be documented.  An
proper attempt to document it would probably have resulted in
improvements to the interface.

I think that (unless suppositions not marked with `xxx' are false)
this doc comment is an improvement over leaving the anomaly totally
undocumented.  I therefore think that this patch is appropriate to
commit right away (!)

Followups, or review comments, which provide more accurate or more
precise wording, would of course be welcome.

I also think that there is room for improvement in the presented
hypercall semantics, as I understand them.


I would have liked to submit a fourth patch which made xl print the
unhelpful message about trying to re-add and re-remove the cpu, only
when it was relevant.

Unfortunately because the hypervisor doesn't seem to distinguish the
troublesome cases, it is not possible for libxl to tell the
difference.  So it is not possible for libxl to return a different
error code which xl could use to tell the difference.

I would appreciate suggestions from hypervisor maintainers as to how
this could be improved.


Finally, sorry for not spotting these problems earlier.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case"
  2016-04-14 17:07 [PATCH 0/3] Revert xc cpupool retries and document anomaly Ian Jackson
@ 2016-04-14 17:07 ` Ian Jackson
  2016-04-15  5:15   ` Juergen Gross
  2016-04-14 17:07 ` [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op Ian Jackson
  2016-04-14 17:07 ` [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result Ian Jackson
  2 siblings, 1 reply; 25+ messages in thread
From: Ian Jackson @ 2016-04-14 17:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Ian Jackson,
	Dario Faggioli, Tim Deegan, Jan Beulich

libxc may be called from within long-running daemons such as libvirt.

In such a system this sleep would enable an uncooperative or buggy
guest to block all toolstack operations for an extended period.

Sadly, therefore, such a retry loop is not feasible without a lot of
engineering which is probably not appropriate.

This reverts commit 1ef6beea187bca8d11152b6c7d987b2b9450f936
"libxc: do some retries in xc_cpupool_removecpu() for EBUSY case"

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Dario Faggioli <dario.faggioli@citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/xc_cpupool.c |   20 +-------------------
 1 file changed, 1 insertion(+), 19 deletions(-)

diff --git a/tools/libxc/xc_cpupool.c b/tools/libxc/xc_cpupool.c
index 261b9c9..c42273e 100644
--- a/tools/libxc/xc_cpupool.c
+++ b/tools/libxc/xc_cpupool.c
@@ -20,7 +20,6 @@
  */
 
 #include <stdarg.h>
-#include <unistd.h>
 #include "xc_private.h"
 
 static int do_sysctl_save(xc_interface *xch, struct xen_sysctl *sysctl)
@@ -138,34 +137,17 @@ int xc_cpupool_addcpu(xc_interface *xch,
     return do_sysctl_save(xch, &sysctl);
 }
 
-/*
- * The hypervisor might return EBUSY when trying to remove a cpu from a
- * cpupool when a domain running in this cpupool has pinned a vcpu
- * temporarily. Do some retries in this case, perhaps the situation
- * cleans up.
- */
-#define NUM_RMCPU_BUSY_RETRIES 5
-
 int xc_cpupool_removecpu(xc_interface *xch,
                          uint32_t poolid,
                          int cpu)
 {
-    unsigned retries;
-    int err;
     DECLARE_SYSCTL;
 
     sysctl.cmd = XEN_SYSCTL_cpupool_op;
     sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_RMCPU;
     sysctl.u.cpupool_op.cpupool_id = poolid;
     sysctl.u.cpupool_op.cpu = (cpu < 0) ? XEN_SYSCTL_CPUPOOL_PAR_ANY : cpu;
-    for ( retries = 0; retries < NUM_RMCPU_BUSY_RETRIES; retries++ ) {
-        err = do_sysctl_save(xch, &sysctl);
-        if ( err < 0 && errno == EBUSY )
-            sleep(1);
-        else
-            break;
-    }
-    return err;
+    return do_sysctl_save(xch, &sysctl);
 }
 
 int xc_cpupool_movedomain(xc_interface *xch,
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op
  2016-04-14 17:07 [PATCH 0/3] Revert xc cpupool retries and document anomaly Ian Jackson
  2016-04-14 17:07 ` [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case" Ian Jackson
@ 2016-04-14 17:07 ` Ian Jackson
  2016-04-14 20:37   ` Dario Faggioli
  2016-04-14 17:07 ` [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result Ian Jackson
  2 siblings, 1 reply; 25+ messages in thread
From: Ian Jackson @ 2016-04-14 17:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Ian Jackson,
	Dario Faggioli, Tim Deegan, Jan Beulich

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Tim Deegan <tim@xen.org>
---
 xen/include/public/sysctl.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 4596d20..0849908 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -538,7 +538,7 @@ struct xen_sysctl_numainfo {
 typedef struct xen_sysctl_numainfo xen_sysctl_numainfo_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_numainfo_t);
 
-/* XEN_SYSCTL_cpupool_op */
+/* ` enum XEN_SYSCTL_cpupool_op { */
 #define XEN_SYSCTL_CPUPOOL_OP_CREATE                1  /* C */
 #define XEN_SYSCTL_CPUPOOL_OP_DESTROY               2  /* D */
 #define XEN_SYSCTL_CPUPOOL_OP_INFO                  3  /* I */
@@ -546,9 +546,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_numainfo_t);
 #define XEN_SYSCTL_CPUPOOL_OP_RMCPU                 5  /* R */
 #define XEN_SYSCTL_CPUPOOL_OP_MOVEDOMAIN            6  /* M */
 #define XEN_SYSCTL_CPUPOOL_OP_FREEINFO              7  /* F */
+/* ` } */
 #define XEN_SYSCTL_CPUPOOL_PAR_ANY     0xFFFFFFFF
 struct xen_sysctl_cpupool_op {
-    uint32_t op;          /* IN */
+    uint32_t op;          /* IN ` enum XEN_SYSCTL_cpupool_op ` */
     uint32_t cpupool_id;  /* IN: CDIARM OUT: CI */
     uint32_t sched_id;    /* IN: C      OUT: I  */
     uint32_t domid;       /* IN: M              */
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 17:07 [PATCH 0/3] Revert xc cpupool retries and document anomaly Ian Jackson
  2016-04-14 17:07 ` [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case" Ian Jackson
  2016-04-14 17:07 ` [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op Ian Jackson
@ 2016-04-14 17:07 ` Ian Jackson
  2016-04-14 17:24   ` Andrew Cooper
  2016-04-15  5:35   ` Juergen Gross
  2 siblings, 2 replies; 25+ messages in thread
From: Ian Jackson @ 2016-04-14 17:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Ian Jackson,
	Dario Faggioli, Tim Deegan, Jan Beulich

This is my attempt at understanding the situation, from reading
descriptions provided on list in the context of toolstack patches
which were attempting to work around the anomaly.

The multiple `xxx' entries reflect 1. my lack of complete understanding
2. API defects which I think I have identified.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
CC: Dario Faggioli <dario.faggioli@citrix.com>
CC: Juergen Gross <jgross@suse.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/include/public/sysctl.h |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 0849908..cfccf38 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -560,6 +560,34 @@ struct xen_sysctl_cpupool_op {
 typedef struct xen_sysctl_cpupool_op xen_sysctl_cpupool_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpupool_op_t);
 
+/*
+ * cpupool operations may return EBUSY if the operation cannot be
+ * executed right now because of another cpupool operation which is
+ * still in progress.  In this case, EBUSY means that the failed
+ * operation had no effect.
+ *
+ * Some operations including at least RMCPU (xxx which others?) may
+ * also return EBUSY because a guest has temporarily pinned one of its
+ * vcpus to the pcpu in question.  It is the pious hope (xxx) of the
+ * author of this comment that this can only occur for domains which
+ * have been granted some kind of hardware privilege (eg passthrough).
+ *
+ * In this case the operation may have been partially carried out and
+ * the pcpu is left in an anomalous state.  In this state the pcpu may
+ * be used by some not readily predictable subset of the vcpus
+ * (domains) whose vcpus are in the old cpupool.  (xxx is this true?)
+ *
+ * This can be detected by seeing whether the pcpu can be added to a
+ * different cpupool.  (xxx this is a silly interface; the situation
+ * should be reported by a different errno value, at least.)  If the
+ * pcpu can't be added to a different cpupool for this reason,
+ * attempts to do so will returning (xxx what errno value?)
+ *
+ * The anomalous situation can be recovered by adding the pcpu back to
+ * the cpupool it came from (xxx this permits a buggy or malicious
+ * guest to prevent the cpu ever being removed from its cpupool).
+ */ 
+
 #define ARINC653_MAX_DOMAINS_PER_SCHEDULE   64
 /*
  * This structure is used to pass a new ARINC653 schedule from a
-- 
1.7.10.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 17:07 ` [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result Ian Jackson
@ 2016-04-14 17:24   ` Andrew Cooper
  2016-04-14 17:56     ` Ian Jackson
  2016-04-15  5:35   ` Juergen Gross
  1 sibling, 1 reply; 25+ messages in thread
From: Andrew Cooper @ 2016-04-14 17:24 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Dario Faggioli,
	Tim Deegan, Jan Beulich

On 14/04/16 18:07, Ian Jackson wrote:
> This is my attempt at understanding the situation, from reading
> descriptions provided on list in the context of toolstack patches
> which were attempting to work around the anomaly.
>
> The multiple `xxx' entries reflect 1. my lack of complete understanding
> 2. API defects which I think I have identified.
>
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  xen/include/public/sysctl.h |   28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 0849908..cfccf38 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -560,6 +560,34 @@ struct xen_sysctl_cpupool_op {
>  typedef struct xen_sysctl_cpupool_op xen_sysctl_cpupool_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpupool_op_t);
>  
> +/*
> + * cpupool operations may return EBUSY if the operation cannot be
> + * executed right now because of another cpupool operation which is
> + * still in progress.  In this case, EBUSY means that the failed
> + * operation had no effect.
> + *
> + * Some operations including at least RMCPU (xxx which others?) may
> + * also return EBUSY because a guest has temporarily pinned one of its
> + * vcpus to the pcpu in question.  It is the pious hope (xxx) of the
> + * author of this comment that this can only occur for domains which
> + * have been granted some kind of hardware privilege (eg passthrough).

Any VM can be given any arbitrary pinning in its xl configuration file. 
Any arbitrary pinning can be applied at runtime via `xl vcpu-pin ...`

(To the best of my knowledge) A VM cannot choose pinning of its own
accord.  (i.e. the host admin has to choose the pinning.)

~Andrew

> + *
> + * In this case the operation may have been partially carried out and
> + * the pcpu is left in an anomalous state.  In this state the pcpu may
> + * be used by some not readily predictable subset of the vcpus
> + * (domains) whose vcpus are in the old cpupool.  (xxx is this true?)
> + *
> + * This can be detected by seeing whether the pcpu can be added to a
> + * different cpupool.  (xxx this is a silly interface; the situation
> + * should be reported by a different errno value, at least.)  If the
> + * pcpu can't be added to a different cpupool for this reason,
> + * attempts to do so will returning (xxx what errno value?)
> + *
> + * The anomalous situation can be recovered by adding the pcpu back to
> + * the cpupool it came from (xxx this permits a buggy or malicious
> + * guest to prevent the cpu ever being removed from its cpupool).
> + */ 
> +
>  #define ARINC653_MAX_DOMAINS_PER_SCHEDULE   64
>  /*
>   * This structure is used to pass a new ARINC653 schedule from a


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 17:24   ` Andrew Cooper
@ 2016-04-14 17:56     ` Ian Jackson
  2016-04-14 20:22       ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Ian Jackson @ 2016-04-14 17:56 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Dario Faggioli,
	Tim Deegan, Jan Beulich

Andrew Cooper writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> On 14/04/16 18:07, Ian Jackson wrote:
> > +/*
> > + * cpupool operations may return EBUSY if the operation cannot be
> > + * executed right now because of another cpupool operation which is
> > + * still in progress.  In this case, EBUSY means that the failed
> > + * operation had no effect.
> > + *
> > + * Some operations including at least RMCPU (xxx which others?) may
> > + * also return EBUSY because a guest has temporarily pinned one of its
> > + * vcpus to the pcpu in question.  It is the pious hope (xxx) of the
> > + * author of this comment that this can only occur for domains which
> > + * have been granted some kind of hardware privilege (eg passthrough).
> 
> Any VM can be given any arbitrary pinning in its xl configuration file. 
> Any arbitrary pinning can be applied at runtime via `xl vcpu-pin ...`

Does that produce EBUSY as well ?

The reuse of the same error number for all of

  "the existing configuration (eg toolstack-selected vcpu pinning)
   means that the operation does not make sense"

  "there is some lock contention and trying again may help"

  "a semantically conflicting, or nearly-semantically-conflicting,
   operation is currently in progress"

  "the guest has done a temporary pin which prevents this operation"

is very unfortunate.  How is a toolstack to know what to do ?

> (To the best of my knowledge) A VM cannot choose pinning of its own
> accord.  (i.e. the host admin has to choose the pinning.)

AIUI, that is not (now) true.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 17:56     ` Ian Jackson
@ 2016-04-14 20:22       ` Dario Faggioli
  2016-04-15 10:20         ` Ian Jackson
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2016-04-14 20:22 UTC (permalink / raw)
  To: Ian Jackson, Andrew Cooper
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 3417 bytes --]

On Thu, 2016-04-14 at 18:56 +0100, Ian Jackson wrote:
> Andrew Cooper writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document
> XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> > 
> > On 14/04/16 18:07, Ian Jackson wrote:
> > > 
> > > +/*
> > > + * cpupool operations may return EBUSY if the operation cannot
> > > be
> > > + * executed right now because of another cpupool operation which
> > > is
> > > + * still in progress.  In this case, EBUSY means that the failed
> > > + * operation had no effect.
> > > + *
> > > + * Some operations including at least RMCPU (xxx which others?)
> > > may
> > > + * also return EBUSY because a guest has temporarily pinned one
> > > of its
> > > + * vcpus to the pcpu in question.  It is the pious hope (xxx) of
> > > the
> > > + * author of this comment that this can only occur for domains
> > > which
> > > + * have been granted some kind of hardware privilege (eg
> > > passthrough).
> > Any VM can be given any arbitrary pinning in its xl configuration
> > file. 
> > Any arbitrary pinning can be applied at runtime via `xl vcpu-pin
> > ...`
> Does that produce EBUSY as well ?
> 
It can, after Juergen series, but I think in this case (setting
affinity), the situation is still acceptable. In fact:

> The reuse of the same error number for all of
> 
>   "the existing configuration (eg toolstack-selected vcpu pinning)
>    means that the operation does not make sense"
> 
This return -EINVAL.

>   "there is some lock contention and trying again may help"
> 
This can't happen in this case (and reason is just that setting the
affinity of a vcpu is different and less problematic than removing a
cpu from a cpupool).

>   "a semantically conflicting, or nearly-semantically-conflicting,
>    operation is currently in progress"
> 
I'm not sure what this means exactly, but I think that --depending on
what it exactly means-- it either can't happen or fall into the -EINVAL
case.

>   "the guest has done a temporary pin which prevents this operation"
> 
This (because of the series) returns -EBUSY.

> is very unfortunate.  How is a toolstack to know what to do ?
> 
Yeah, I agree, but again, I think in this case it's possible for
toolstack to tell.

From a quick check, we do not, in libxl, output any specific error
message in case we get -EBUSY... but I can send a patch to that effect
pretty quickly, if that's deemed necessary.

> > (To the best of my knowledge) A VM cannot choose pinning of its own
> > accord.  (i.e. the host admin has to choose the pinning.)
> AIUI, that is not (now) true.
> 
Yes, now a guest can call the new SCHEDOP_pin_override hypercall (and
Juergen is pushing a series to Linux for it to be able to do that... as
that was the purpose of the while thing!).

However, as said in another email, there's already a check like this in
place, in the implementation of such an hypercall:

        ret = -EPERM;
        if ( !is_hardware_domain(current->domain) )
            break;

which I think satisfies Ian's (legitimate) concern?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op
  2016-04-14 17:07 ` [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op Ian Jackson
@ 2016-04-14 20:37   ` Dario Faggioli
  2016-04-15 10:27     ` Ian Jackson
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2016-04-14 20:37 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: Juergen Gross, Wei Liu, George Dunlap, Tim Deegan, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1751 bytes --]

On Thu, 2016-04-14 at 18:07 +0100, Ian Jackson wrote:
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Tim Deegan <tim@xen.org>
>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>

One thing, out of curiosity. This syntax, here:

> -/* XEN_SYSCTL_cpupool_op */
> +/* ` enum XEN_SYSCTL_cpupool_op { */
>  #define XEN_SYSCTL_CPUPOOL_OP_CREATE                1  /* C */
>  #define XEN_SYSCTL_CPUPOOL_OP_DESTROY               2  /* D */
>  #define XEN_SYSCTL_CPUPOOL_OP_INFO                  3  /* I */
> @@ -546,9 +546,10 @@ DEFINE_XEN_GUEST_HANDLE(xen_sysctl_numainfo_t);
>  #define XEN_SYSCTL_CPUPOOL_OP_RMCPU                 5  /* R */
>  #define XEN_SYSCTL_CPUPOOL_OP_MOVEDOMAIN            6  /* M */
>  #define XEN_SYSCTL_CPUPOOL_OP_FREEINFO              7  /* F */
> +/* ` } */
>  #define XEN_SYSCTL_CPUPOOL_PAR_ANY     0xFFFFFFFF
>  struct xen_sysctl_cpupool_op {
> -    uint32_t op;          /* IN */
> +    uint32_t op;          /* IN ` enum XEN_SYSCTL_cpupool_op ` */
>
and here, is I guess useful to the hypercall HTML docs generator, as
mentioned in the cover letter?

It is not something we do in many other places (if at all, at least in
this file)... If it is, I'll happily add to my TODO list to convert
more entries to it.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case"
  2016-04-14 17:07 ` [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case" Ian Jackson
@ 2016-04-15  5:15   ` Juergen Gross
  2016-04-15  7:46     ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2016-04-15  5:15 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: Wei Liu, George Dunlap, Dario Faggioli, Tim Deegan, Jan Beulich

On 14/04/16 19:07, Ian Jackson wrote:
> libxc may be called from within long-running daemons such as libvirt.
> 
> In such a system this sleep would enable an uncooperative or buggy
> guest to block all toolstack operations for an extended period.
> 
> Sadly, therefore, such a retry loop is not feasible without a lot of
> engineering which is probably not appropriate.

I understand your concerns. OTOH you should consider that libvirt has
no support for cpupool operations today, so it won't run this code.

> This reverts commit 1ef6beea187bca8d11152b6c7d987b2b9450f936
> "libxc: do some retries in xc_cpupool_removecpu() for EBUSY case"
> 
> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> CC: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxc/xc_cpupool.c |   20 +-------------------
>  1 file changed, 1 insertion(+), 19 deletions(-)
> 
> diff --git a/tools/libxc/xc_cpupool.c b/tools/libxc/xc_cpupool.c
> index 261b9c9..c42273e 100644
> --- a/tools/libxc/xc_cpupool.c
> +++ b/tools/libxc/xc_cpupool.c
> @@ -20,7 +20,6 @@
>   */
>  
>  #include <stdarg.h>
> -#include <unistd.h>
>  #include "xc_private.h"
>  
>  static int do_sysctl_save(xc_interface *xch, struct xen_sysctl *sysctl)
> @@ -138,34 +137,17 @@ int xc_cpupool_addcpu(xc_interface *xch,
>      return do_sysctl_save(xch, &sysctl);
>  }
>  
> -/*
> - * The hypervisor might return EBUSY when trying to remove a cpu from a
> - * cpupool when a domain running in this cpupool has pinned a vcpu
> - * temporarily. Do some retries in this case, perhaps the situation
> - * cleans up.
> - */
> -#define NUM_RMCPU_BUSY_RETRIES 5
> -
>  int xc_cpupool_removecpu(xc_interface *xch,
>                           uint32_t poolid,
>                           int cpu)
>  {
> -    unsigned retries;
> -    int err;
>      DECLARE_SYSCTL;
>  
>      sysctl.cmd = XEN_SYSCTL_cpupool_op;
>      sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_RMCPU;
>      sysctl.u.cpupool_op.cpupool_id = poolid;
>      sysctl.u.cpupool_op.cpu = (cpu < 0) ? XEN_SYSCTL_CPUPOOL_PAR_ANY : cpu;
> -    for ( retries = 0; retries < NUM_RMCPU_BUSY_RETRIES; retries++ ) {
> -        err = do_sysctl_save(xch, &sysctl);
> -        if ( err < 0 && errno == EBUSY )
> -            sleep(1);

I'd rather just remove this sleep() call than the whole retry logic.
EBUSY cases should be very very very very rare and last for some
msecs or usecs only.


Juergen

> -        else
> -            break;
> -    }
> -    return err;
> +    return do_sysctl_save(xch, &sysctl);
>  }
>  
>  int xc_cpupool_movedomain(xc_interface *xch,
> 


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 17:07 ` [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result Ian Jackson
  2016-04-14 17:24   ` Andrew Cooper
@ 2016-04-15  5:35   ` Juergen Gross
  2016-04-15  7:42     ` Dario Faggioli
  1 sibling, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2016-04-15  5:35 UTC (permalink / raw)
  To: Ian Jackson, xen-devel
  Cc: Wei Liu, George Dunlap, Dario Faggioli, Tim Deegan, Jan Beulich

On 14/04/16 19:07, Ian Jackson wrote:
> This is my attempt at understanding the situation, from reading
> descriptions provided on list in the context of toolstack patches
> which were attempting to work around the anomaly.
> 
> The multiple `xxx' entries reflect 1. my lack of complete understanding
> 2. API defects which I think I have identified.
> 
> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> CC: Dario Faggioli <dario.faggioli@citrix.com>
> CC: Juergen Gross <jgross@suse.com>
> CC: George Dunlap <george.dunlap@eu.citrix.com>
> CC: Jan Beulich <jbeulich@suse.com>
> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  xen/include/public/sysctl.h |   28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 0849908..cfccf38 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -560,6 +560,34 @@ struct xen_sysctl_cpupool_op {
>  typedef struct xen_sysctl_cpupool_op xen_sysctl_cpupool_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_cpupool_op_t);
>  
> +/*
> + * cpupool operations may return EBUSY if the operation cannot be
> + * executed right now because of another cpupool operation which is
> + * still in progress.  In this case, EBUSY means that the failed
> + * operation had no effect.
> + *
> + * Some operations including at least RMCPU (xxx which others?) may

Only this one.

> + * also return EBUSY because a guest has temporarily pinned one of its
> + * vcpus to the pcpu in question.  It is the pious hope (xxx) of the
> + * author of this comment that this can only occur for domains which
> + * have been granted some kind of hardware privilege (eg passthrough).

Only the hardware domain is allowed to do this.

> + * In this case the operation may have been partially carried out and
> + * the pcpu is left in an anomalous state.  In this state the pcpu may
> + * be used by some not readily predictable subset of the vcpus
> + * (domains) whose vcpus are in the old cpupool.  (xxx is this true?)

Yes.

> + *
> + * This can be detected by seeing whether the pcpu can be added to a
> + * different cpupool.  (xxx this is a silly interface; the situation
> + * should be reported by a different errno value, at least.)  If the

I'm open for suggestions. :-)

> + * pcpu can't be added to a different cpupool for this reason,
> + * attempts to do so will returning (xxx what errno value?)

You might have guessed: EBUSY

> + *
> + * The anomalous situation can be recovered by adding the pcpu back to
> + * the cpupool it came from (xxx this permits a buggy or malicious
> + * guest to prevent the cpu ever being removed from its cpupool).

Only the hardware domain, which has other means to inject problems.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15  5:35   ` Juergen Gross
@ 2016-04-15  7:42     ` Dario Faggioli
  2016-04-15 14:12       ` Ian Jackson
  0 siblings, 1 reply; 25+ messages in thread
From: Dario Faggioli @ 2016-04-15  7:42 UTC (permalink / raw)
  To: Juergen Gross, Ian Jackson, xen-devel
  Cc: George Dunlap, Tim Deegan, Wei Liu, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 3371 bytes --]

On Fri, 2016-04-15 at 07:35 +0200, Juergen Gross wrote:
> On 14/04/16 19:07, Ian Jackson wrote:
> > 
> > --- a/xen/include/public/sysctl.h
> > +++ b/xen/include/public/sysctl.h
> > @@ -560,6 +560,34 @@ struct xen_sysctl_cpupool_op {
> > 
> > + * In this case the operation may have been partially carried out
> > and
> > + * the pcpu is left in an anomalous state.  In this state the pcpu
> > may
> > + * be used by some not readily predictable subset of the vcpus
> > + * (domains) whose vcpus are in the old cpupool.  (xxx is this
> > true?)
> Yes.
> 
Well, I think it's a little less bad than that. the pcpu is no longer a
valid member of its cpupool. This means the scheduler will not run
vcpus of the domains of the pool (it's domains that are in cpupools)
any longer. And even the vcpus that are temporarily stuck on the pcpu
(i.e., the ones that are actually preventing the operation to succeed)
will rather quickly get away from it.

So, I'd say rather that "In this state the pcpu will, potentially after
a short transitory, just not be used by any vcpu"

> > + *
> > + * This can be detected by seeing whether the pcpu can be added to
> > a
> > + * different cpupool.  (xxx this is a silly interface; the
> > situation
> > + * should be reported by a different errno value, at least.)  If
> > the
> I'm open for suggestions. :-)
> 
Well, changing the system behavior with anything that involves retry
loops is indeed non-trivial (or, at least, I don't out of the top of my
head have an alternative).

However, a different return value for the super special case of
temporary pinning override could maybe be selected. I'm having a look,
and although I don't find one that could be seen as a perfect fit (that
would be EBUSY, which is taken!), what about one of these:

  EEXIST        17      /* File exists */
  ENOTEMPTY     39      /* Directory not empty */
  EROFS         30      /* Read-only file system */
  ENOSPC        28      /* No space left on device */

After all, as ugly as the resulting error message would be:
 - if's a very rare special case
 - we need to document this very rare special case anyway
 - we can log more info about the actual error

> > + * pcpu can't be added to a different cpupool for this reason,
> > + * attempts to do so will returning (xxx what errno value?)
> You might have guessed: EBUSY
> 
And here too, if the pcpu is not in cpupool_free_cpus, maybe we can go
and check whether it is in any of the existing cpupool's cpu_valid. If
it is, then EBUSY is ok. If not, it means it's in the inconsistent
state, and we can pick another (same as above?) error value, use it and
document the situation.

> > + *
> > + * The anomalous situation can be recovered by adding the pcpu
> > back to
> > + * the cpupool it came from (xxx this permits a buggy or malicious
> > + * guest to prevent the cpu ever being removed from its cpupool).
> Only the hardware domain, which has other means to inject problems.
> 
Yep, indeed.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case"
  2016-04-15  5:15   ` Juergen Gross
@ 2016-04-15  7:46     ` Dario Faggioli
  0 siblings, 0 replies; 25+ messages in thread
From: Dario Faggioli @ 2016-04-15  7:46 UTC (permalink / raw)
  To: Juergen Gross, Ian Jackson, xen-devel
  Cc: George Dunlap, Tim Deegan, Wei Liu, Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 2424 bytes --]

On Fri, 2016-04-15 at 07:15 +0200, Juergen Gross wrote:
> On 14/04/16 19:07, Ian Jackson wrote:
> > 
> > libxc may be called from within long-running daemons such as
> > libvirt.
> > 
> > In such a system this sleep would enable an uncooperative or buggy
> > guest to block all toolstack operations for an extended period.
> > 
> > Sadly, therefore, such a retry loop is not feasible without a lot
> > of
> > engineering which is probably not appropriate.
> I understand your concerns. OTOH you should consider that libvirt has
> no support for cpupool operations today, so it won't run this code.
> 
True, and it does not even have the concept of pooling (not in a
compatible way with what we provide with cpupools), so it probably
won't be start supporting them anytime soon.

HOWEVER, Ian's concern applies to any daemon based toolstack (i.e., not
necessarily libvirt) which may want to support cpupools, and would then
fall into this.

> > diff --git a/tools/libxc/xc_cpupool.c b/tools/libxc/xc_cpupool.c
> > index 261b9c9..c42273e 100644
> > --- a/tools/libxc/xc_cpupool.c
> > +++ b/tools/libxc/xc_cpupool.c
> > @@ -20,7 +20,6 @@
> > 
> >  int xc_cpupool_removecpu(xc_interface *xch,
> >                           uint32_t poolid,
> >                           int cpu)
> >  {
> > -    unsigned retries;
> > -    int err;
> >      DECLARE_SYSCTL;
> >  
> >      sysctl.cmd = XEN_SYSCTL_cpupool_op;
> >      sysctl.u.cpupool_op.op = XEN_SYSCTL_CPUPOOL_OP_RMCPU;
> >      sysctl.u.cpupool_op.cpupool_id = poolid;
> >      sysctl.u.cpupool_op.cpu = (cpu < 0) ?
> > XEN_SYSCTL_CPUPOOL_PAR_ANY : cpu;
> > -    for ( retries = 0; retries < NUM_RMCPU_BUSY_RETRIES; retries++
> > ) {
> > -        err = do_sysctl_save(xch, &sysctl);
> > -        if ( err < 0 && errno == EBUSY )
> > -            sleep(1);
> I'd rather just remove this sleep() call than the whole retry logic.
> EBUSY cases should be very very very very rare and last for some
> msecs or usecs only.
> 
+1

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-14 20:22       ` Dario Faggioli
@ 2016-04-15 10:20         ` Ian Jackson
  2016-04-15 10:43           ` Juergen Gross
  0 siblings, 1 reply; 25+ messages in thread
From: Ian Jackson @ 2016-04-15 10:20 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Andrew Cooper,
	Tim Deegan, Jan Beulich

Dario Faggioli writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> On Thu, 2016-04-14 at 18:56 +0100, Ian Jackson wrote:
> > is very unfortunate.  How is a toolstack to know what to do ?
> > 
> Yeah, I agree, but again, I think in this case it's possible for
> toolstack to tell.

Thanks to both you and Juergen for the explanations.  I'm glad to see
that things weren't as bad as I feared - I admit that I didn't spend
the time to completely follow what all the relevant bits of code did.

Would either of you care to provide a version of my documentation patch
which answers the questions that my text answers ?  Or shall we commit
my version and you can edit it in-tree :-).

>         ret = -EPERM;
>         if ( !is_hardware_domain(current->domain) )
>             break;
> 
> which I think satisfies Ian's (legitimate) concern?

That addresses the security concern, yes.  (Is a driver domain a
`hardware domain' in this sense?)

All I need now is a recipe for the tools to tell what has happened and
then I can make xl or libxl at least print comprehensible and correct
error messages...

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op
  2016-04-14 20:37   ` Dario Faggioli
@ 2016-04-15 10:27     ` Ian Jackson
  0 siblings, 0 replies; 25+ messages in thread
From: Ian Jackson @ 2016-04-15 10:27 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich

Dario Faggioli writes ("Re: [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op"):
> On Thu, 2016-04-14 at 18:07 +0100, Ian Jackson wrote:
> > Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
> > CC: Jan Beulich <jbeulich@suse.com>
> > CC: Tim Deegan <tim@xen.org>
> >
> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
> 
> One thing, out of curiosity. This syntax, here:
> 
> > -/* XEN_SYSCTL_cpupool_op */
> > +/* ` enum XEN_SYSCTL_cpupool_op { */
...
> > +    uint32_t op;          /* IN ` enum XEN_SYSCTL_cpupool_op ` */
> >
> and here, is I guess useful to the hypercall HTML docs generator, as
> mentioned in the cover letter?

Yes.  Observe that here
  http://xenbits.xen.org/docs/unstable/hypercall/x86_64/index.html
does not contain an entry for the cpupool ops in the section
"Enums and sets of #defines".  And it doesn't even formally state that
the `op' in that struct is one of the XEN_SYSCTL_cpupool_op values -
although the reader will probably guess that that's the case.

The extra markup syntax is pretty minimal and is documented at the top
of the script
  docs/xen-headers

> It is not something we do in many other places (if at all, at least in
> this file)... If it is, I'll happily add to my TODO list to convert
> more entries to it.

Ideally I would like to see all hypercalls and all their arguments
documented properly.  That includes annotating them for the html
cross-reference generator.  But it also includes documenting their
semantics and their error conditions.

Unfortunately, we are starting from a rather sketchy baseline.

Thanks for the offer to help!

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 10:20         ` Ian Jackson
@ 2016-04-15 10:43           ` Juergen Gross
  2016-04-15 10:58             ` Dario Faggioli
  0 siblings, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2016-04-15 10:43 UTC (permalink / raw)
  To: Ian Jackson, Dario Faggioli
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jan Beulich

On 15/04/16 12:20, Ian Jackson wrote:
> Dario Faggioli writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
>> On Thu, 2016-04-14 at 18:56 +0100, Ian Jackson wrote:
>>> is very unfortunate.  How is a toolstack to know what to do ?
>>>
>> Yeah, I agree, but again, I think in this case it's possible for
>> toolstack to tell.
> 
> Thanks to both you and Juergen for the explanations.  I'm glad to see
> that things weren't as bad as I feared - I admit that I didn't spend
> the time to completely follow what all the relevant bits of code did.
> 
> Would either of you care to provide a version of my documentation patch
> which answers the questions that my text answers ?  Or shall we commit
> my version and you can edit it in-tree :-).

I can provide an updated patch.

>>         ret = -EPERM;
>>         if ( !is_hardware_domain(current->domain) )
>>             break;
>>
>> which I think satisfies Ian's (legitimate) concern?
> 
> That addresses the security concern, yes.  (Is a driver domain a
> `hardware domain' in this sense?)

The hardware domain is the one domain having direct access to the
native hardware. This is dom0 in most cases. There can be only one
hardware domain.

> All I need now is a recipe for the tools to tell what has happened and
> then I can make xl or libxl at least print comprehensible and correct
> error messages...

So this boils down to finding an appropriate ESOMETHING replacement
for the EBUSY case introduced by the temporary pinning.

I think ENOTEMPTY or EADDRINUSE would fit best.

The EBUSY returns of not successful repair attempts (trying to assign a
cpu to another cpupool) should be changed to e.g. EADDRNOTAVAIL?


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 10:43           ` Juergen Gross
@ 2016-04-15 10:58             ` Dario Faggioli
  2016-04-15 11:37               ` Juergen Gross
  2016-04-15 13:20               ` Juergen Gross
  0 siblings, 2 replies; 25+ messages in thread
From: Dario Faggioli @ 2016-04-15 10:58 UTC (permalink / raw)
  To: Juergen Gross, Ian Jackson
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1536 bytes --]

On Fri, 2016-04-15 at 12:43 +0200, Juergen Gross wrote:
> On 15/04/16 12:20, Ian Jackson wrote:
> > 
> > Would either of you care to provide a version of my documentation
> > patch
> > which answers the questions that my text answers ?  Or shall we
> > commit
> > my version and you can edit it in-tree :-).
> I can provide an updated patch.
> 
Great.

> > All I need now is a recipe for the tools to tell what has happened
> > and
> > then I can make xl or libxl at least print comprehensible and
> > correct
> > error messages...
> So this boils down to finding an appropriate ESOMETHING replacement
> for the EBUSY case introduced by the temporary pinning.
> 
Exactly.

> I think ENOTEMPTY or EADDRINUSE would fit best.
>
I like the latter better.
> 
> The EBUSY returns of not successful repair attempts (trying to assign
> a
> cpu to another cpupool) should be changed to e.g. EADDRNOTAVAIL?
> 
I'd go for EADDRINUSE and EADDRNOTAVAIL so the two error values are
similarly *wrong* hinting an addressing issue, which is more consistent
(and would come handy when documenting) than having one pointing at the
filesystem and the other at the address space.

Are you going to do the patch for this yourself as well?

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 10:58             ` Dario Faggioli
@ 2016-04-15 11:37               ` Juergen Gross
  2016-04-15 13:20               ` Juergen Gross
  1 sibling, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2016-04-15 11:37 UTC (permalink / raw)
  To: Dario Faggioli, Ian Jackson
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jan Beulich

On 15/04/16 12:58, Dario Faggioli wrote:
> On Fri, 2016-04-15 at 12:43 +0200, Juergen Gross wrote:
>> On 15/04/16 12:20, Ian Jackson wrote:
>>>
>>> Would either of you care to provide a version of my documentation
>>> patch
>>> which answers the questions that my text answers ?  Or shall we
>>> commit
>>> my version and you can edit it in-tree :-).
>> I can provide an updated patch.
>>
> Great.
> 
>>> All I need now is a recipe for the tools to tell what has happened
>>> and
>>> then I can make xl or libxl at least print comprehensible and
>>> correct
>>> error messages...
>> So this boils down to finding an appropriate ESOMETHING replacement
>> for the EBUSY case introduced by the temporary pinning.
>>
> Exactly.
> 
>> I think ENOTEMPTY or EADDRINUSE would fit best.
>>
> I like the latter better.
>>
>> The EBUSY returns of not successful repair attempts (trying to assign
>> a
>> cpu to another cpupool) should be changed to e.g. EADDRNOTAVAIL?
>>
> I'd go for EADDRINUSE and EADDRNOTAVAIL so the two error values are
> similarly *wrong* hinting an addressing issue, which is more consistent
> (and would come handy when documenting) than having one pointing at the
> filesystem and the other at the address space.
> 
> Are you going to do the patch for this yourself as well?

Sure.


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 10:58             ` Dario Faggioli
  2016-04-15 11:37               ` Juergen Gross
@ 2016-04-15 13:20               ` Juergen Gross
  2016-04-15 13:33                 ` Dario Faggioli
  2016-04-15 14:11                 ` Ian Jackson
  1 sibling, 2 replies; 25+ messages in thread
From: Juergen Gross @ 2016-04-15 13:20 UTC (permalink / raw)
  To: Dario Faggioli, Ian Jackson
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jan Beulich

On 15/04/16 12:58, Dario Faggioli wrote:
> On Fri, 2016-04-15 at 12:43 +0200, Juergen Gross wrote:
>> On 15/04/16 12:20, Ian Jackson wrote:
>>>
>>> Would either of you care to provide a version of my documentation
>>> patch
>>> which answers the questions that my text answers ?  Or shall we
>>> commit
>>> my version and you can edit it in-tree :-).
>> I can provide an updated patch.
>>
> Great.
> 
>>> All I need now is a recipe for the tools to tell what has happened
>>> and
>>> then I can make xl or libxl at least print comprehensible and
>>> correct
>>> error messages...
>> So this boils down to finding an appropriate ESOMETHING replacement
>> for the EBUSY case introduced by the temporary pinning.
>>
> Exactly.
> 
>> I think ENOTEMPTY or EADDRINUSE would fit best.
>>
> I like the latter better.
>>
>> The EBUSY returns of not successful repair attempts (trying to assign
>> a
>> cpu to another cpupool) should be changed to e.g. EADDRNOTAVAIL?
>>
> I'd go for EADDRINUSE and EADDRNOTAVAIL so the two error values are
> similarly *wrong* hinting an addressing issue, which is more consistent
> (and would come handy when documenting) than having one pointing at the
> filesystem and the other at the address space.

Just saw there are still two cases left where EBUSY will be returned:

- when trying to remove the last cpu from a cpupool with active domains
  (EBUSY seems appropriate)
- when trying to move a cpu which is not available in the source cpupool
  or free cpus (I'd suggest ENODEV in this case)


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 13:20               ` Juergen Gross
@ 2016-04-15 13:33                 ` Dario Faggioli
  2016-04-15 14:11                 ` Ian Jackson
  1 sibling, 0 replies; 25+ messages in thread
From: Dario Faggioli @ 2016-04-15 13:33 UTC (permalink / raw)
  To: Juergen Gross, Ian Jackson
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Tim Deegan,
	Jan Beulich


[-- Attachment #1.1: Type: text/plain, Size: 1667 bytes --]

On Fri, 2016-04-15 at 15:20 +0200, Juergen Gross wrote:
> On 15/04/16 12:58, Dario Faggioli wrote:
> > 
> > On Fri, 2016-04-15 at 12:43 +0200, Juergen Gross wrote:
> > > 
> > > The EBUSY returns of not successful repair attempts (trying to
> > > assign
> > > a
> > > cpu to another cpupool) should be changed to e.g. EADDRNOTAVAIL?
> > > 
> > I'd go for EADDRINUSE and EADDRNOTAVAIL so the two error values are
> > similarly *wrong* hinting an addressing issue, which is more
> > consistent
> > (and would come handy when documenting) than having one pointing at
> > the
> > filesystem and the other at the address space.
> Just saw there are still two cases left where EBUSY will be returned:
> 
I know...

> - when trying to remove the last cpu from a cpupool with active
> domains
>   (EBUSY seems appropriate)
>
Indeed.

> - when trying to move a cpu which is not available in the source
> cpupool
>   or free cpus (I'd suggest ENODEV in this case)
> 
Well, in practice, I don't actually think this is too big of a deal. In
fact, I don't think that tools can or need to differentiate their
behavior too much between this and, for instance, the above mentioned
failure.

_However_, if while you're there you're up for making this case
returning ENODEV, it would indeed look like it's more descriptive of
the actual problem, and hence better.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 13:20               ` Juergen Gross
  2016-04-15 13:33                 ` Dario Faggioli
@ 2016-04-15 14:11                 ` Ian Jackson
  2016-04-15 14:39                   ` Juergen Gross
  1 sibling, 1 reply; 25+ messages in thread
From: Ian Jackson @ 2016-04-15 14:11 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Dario Faggioli,
	Tim Deegan, Jan Beulich

Juergen Gross writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> Just saw there are still two cases left where EBUSY will be returned:
> 
> - when trying to remove the last cpu from a cpupool with active domains
>   (EBUSY seems appropriate)

The problem is that EBUSY here might mean "this operation can't be
done because it amounts to a request for a configuration which
violates an invariant" or "this operation can't be done for temporary
reasons and you should retry it".

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15  7:42     ` Dario Faggioli
@ 2016-04-15 14:12       ` Ian Jackson
  2016-04-15 14:17         ` Konrad Rzeszutek Wilk
  2016-04-15 14:34         ` Juergen Gross
  0 siblings, 2 replies; 25+ messages in thread
From: Ian Jackson @ 2016-04-15 14:12 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Tim Deegan,
	Jan Beulich

Dario Faggioli writes ("Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> However, a different return value for the super special case of
> temporary pinning override could maybe be selected. I'm having a look,
> and although I don't find one that could be seen as a perfect fit (that
> would be EBUSY, which is taken!), what about one of these:
> 
>   EEXIST        17      /* File exists */
>   ENOTEMPTY     39      /* Directory not empty */
>   EROFS         30      /* Read-only file system */
>   ENOSPC        28      /* No space left on device */

Does the hypervisor currently use EAGAIN for anything ?

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 14:12       ` Ian Jackson
@ 2016-04-15 14:17         ` Konrad Rzeszutek Wilk
  2016-04-15 14:34         ` Juergen Gross
  1 sibling, 0 replies; 25+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-04-15 14:17 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Juergen Gross, xen-devel, Wei Liu, George Dunlap, Dario Faggioli,
	Tim Deegan, Jan Beulich

On Fri, Apr 15, 2016 at 03:12:56PM +0100, Ian Jackson wrote:
> Dario Faggioli writes ("Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> > However, a different return value for the super special case of
> > temporary pinning override could maybe be selected. I'm having a look,
> > and although I don't find one that could be seen as a perfect fit (that
> > would be EBUSY, which is taken!), what about one of these:
> > 
> >   EEXIST        17      /* File exists */
> >   ENOTEMPTY     39      /* Directory not empty */
> >   EROFS         30      /* Read-only file system */
> >   ENOSPC        28      /* No space left on device */
> 
> Does the hypervisor currently use EAGAIN for anything ?

Yes. XEN_SYSCTL_pm_op, XSM, HVMOP_get_mem_type, MAP_PIRQ_TYPE_MULTI_MSI (ah
scratch that, it makes it -EINVAL).
> 
> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 14:12       ` Ian Jackson
  2016-04-15 14:17         ` Konrad Rzeszutek Wilk
@ 2016-04-15 14:34         ` Juergen Gross
  2016-04-15 14:44           ` Ian Jackson
  1 sibling, 1 reply; 25+ messages in thread
From: Juergen Gross @ 2016-04-15 14:34 UTC (permalink / raw)
  To: Ian Jackson, Dario Faggioli
  Cc: xen-devel, Wei Liu, George Dunlap, Tim Deegan, Jan Beulich

On 15/04/16 16:12, Ian Jackson wrote:
> Dario Faggioli writes ("Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
>> However, a different return value for the super special case of
>> temporary pinning override could maybe be selected. I'm having a look,
>> and although I don't find one that could be seen as a perfect fit (that
>> would be EBUSY, which is taken!), what about one of these:
>>
>>   EEXIST        17      /* File exists */
>>   ENOTEMPTY     39      /* Directory not empty */
>>   EROFS         30      /* Read-only file system */
>>   ENOSPC        28      /* No space left on device */
> 
> Does the hypervisor currently use EAGAIN for anything ?

Yes. Especially XEN_SYSCTL_CPUPOOL_OP_RMCPU can return it already.
Before you are asking: Jan didn't want it to return EAGAIN in the
temporary pinning case as the hypervisor couldn't guarantee that
the situation will be resolved (opposed to the other usage of
EAGAIN).


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 14:11                 ` Ian Jackson
@ 2016-04-15 14:39                   ` Juergen Gross
  0 siblings, 0 replies; 25+ messages in thread
From: Juergen Gross @ 2016-04-15 14:39 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Wei Liu, George Dunlap, Andrew Cooper, Dario Faggioli,
	Tim Deegan, Jan Beulich

On 15/04/16 16:11, Ian Jackson wrote:
> Juergen Gross writes ("Re: [Xen-devel] [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
>> Just saw there are still two cases left where EBUSY will be returned:
>>
>> - when trying to remove the last cpu from a cpupool with active domains
>>   (EBUSY seems appropriate)
> 
> The problem is that EBUSY here might mean "this operation can't be
> done because it amounts to a request for a configuration which
> violates an invariant" or "this operation can't be done for temporary
> reasons and you should retry it".

I'm not sure I understand your comment correctly: do you see a problem
with EBUSY being returned in the two cases I mentioned or just in the
one case you cited?


Juergen


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result
  2016-04-15 14:34         ` Juergen Gross
@ 2016-04-15 14:44           ` Ian Jackson
  0 siblings, 0 replies; 25+ messages in thread
From: Ian Jackson @ 2016-04-15 14:44 UTC (permalink / raw)
  To: Juergen Gross
  Cc: xen-devel, Wei Liu, George Dunlap, Dario Faggioli, Tim Deegan,
	Jan Beulich

Juergen Gross writes ("Re: [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result"):
> On 15/04/16 16:12, Ian Jackson wrote:
> > Does the hypervisor currently use EAGAIN for anything ?
> 
> Yes. Especially XEN_SYSCTL_CPUPOOL_OP_RMCPU can return it already.

That is a good reason not to use it.

> Before you are asking: Jan didn't want it to return EAGAIN in the
> temporary pinning case as the hypervisor couldn't guarantee that
> the situation will be resolved (opposed to the other usage of
> EAGAIN).

Hrm.  I note that ENOLCK is in the list in errno.h but not used
anywhere.  EISCONN is another possibility...

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-04-15 14:44 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-14 17:07 [PATCH 0/3] Revert xc cpupool retries and document anomaly Ian Jackson
2016-04-14 17:07 ` [PATCH 1/3] libxc: Revert "do some retries in xc_cpupool_removecpu() for EBUSY case" Ian Jackson
2016-04-15  5:15   ` Juergen Gross
2016-04-15  7:46     ` Dario Faggioli
2016-04-14 17:07 ` [PATCH 2/3] xen: hypercall docs annotations for xen_sysctl_cpupool_op Ian Jackson
2016-04-14 20:37   ` Dario Faggioli
2016-04-15 10:27     ` Ian Jackson
2016-04-14 17:07 ` [PATCH 3/3] xen: Document XEN_SYSCTL_CPUPOOL_OP_RMCPU anomalous EBUSY result Ian Jackson
2016-04-14 17:24   ` Andrew Cooper
2016-04-14 17:56     ` Ian Jackson
2016-04-14 20:22       ` Dario Faggioli
2016-04-15 10:20         ` Ian Jackson
2016-04-15 10:43           ` Juergen Gross
2016-04-15 10:58             ` Dario Faggioli
2016-04-15 11:37               ` Juergen Gross
2016-04-15 13:20               ` Juergen Gross
2016-04-15 13:33                 ` Dario Faggioli
2016-04-15 14:11                 ` Ian Jackson
2016-04-15 14:39                   ` Juergen Gross
2016-04-15  5:35   ` Juergen Gross
2016-04-15  7:42     ` Dario Faggioli
2016-04-15 14:12       ` Ian Jackson
2016-04-15 14:17         ` Konrad Rzeszutek Wilk
2016-04-15 14:34         ` Juergen Gross
2016-04-15 14:44           ` Ian Jackson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.