All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: FW: Cancelling asynchronous operations in libxl
       [not found] <4B8F5D33B081C044AA43634E84ED7F9616A83D@AMSPEX01CL03.citrite.net>
@ 2013-10-23 17:23 ` Konrad Rzeszutek Wilk
  2013-10-26  8:33   ` Ian Campbell
       [not found]   ` <1382776392.22417.179.camel@hastur.hellion.org.uk>
  0 siblings, 2 replies; 29+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-10-23 17:23 UTC (permalink / raw)
  To: Simon Beaumont; +Cc: xs-devel, xen-devel

On Tue, Oct 15, 2013 at 04:08:31PM +0000, Simon Beaumont wrote:
> We're in the process of porting xenopsd[1] to libxl, rather than driving libxc manually.
> 
> Using our libxc-based backend, we are able to cancel operations. For operations that are using XenStore watches we wrap these in a cancellable_watch[2], and for operations that make use of a subprocess we send SIGKILL when we wish to cancel the associated task. We would then instrument any necessary cleanup by hand on a best-effort basis.
> 
> With the move to libxl, a lot of this control will be abstracted away and it is unclear how best to allow long-running tasks to be cancelled. It seems most of these operations could be executed asynchronously but we wonder if it is possible to cancel them, or how we could add cancellation functionality to these operations?

Are there specific operations that you are focusing on to cancel?

Or is this mostly of 'the guest is doing something and does not seem
to have a vnc/console, lets kill it' ?

> 
> Cheers,
> 
> Si
> 
> [1]: https://github.com/xapi-project/xenopsd
> [2]: https://github.com/xapi-project/xenopsd/blob/master/xc/cancel_utils.ml#L101
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-10-23 17:23 ` FW: Cancelling asynchronous operations in libxl Konrad Rzeszutek Wilk
@ 2013-10-26  8:33   ` Ian Campbell
       [not found]   ` <1382776392.22417.179.camel@hastur.hellion.org.uk>
  1 sibling, 0 replies; 29+ messages in thread
From: Ian Campbell @ 2013-10-26  8:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: Simon Beaumont, xs-devel, Ian Jackson, xen-devel

On Wed, 2013-10-23 at 13:23 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Oct 15, 2013 at 04:08:31PM +0000, Simon Beaumont wrote:
> > We're in the process of porting xenopsd[1] to libxl, rather than
> driving libxc manually.
> > 
> > Using our libxc-based backend, we are able to cancel operations. For
> operations that are using XenStore watches we wrap these in a
> cancellable_watch[2], and for operations that make use of a subprocess
> we send SIGKILL when we wish to cancel the associated task. We would
> then instrument any necessary cleanup by hand on a best-effort basis.
> > 
> > With the move to libxl, a lot of this control will be abstracted
> away and it is unclear how best to allow long-running tasks to be
> cancelled. It seems most of these operations could be executed
> asynchronously but we wonder if it is possible to cancel them, or how
> we could add cancellation functionality to these operations?
> 
> Are there specific operations that you are focusing on to cancel?

Yes, this would be good to know. Ian J would know better than I but AIUI
there is currently no generic support for cancelling an asynchronous op
in libxl. I'm sure we could find a way to add this, for example as an
abort after the current "step" succeeds/times out/fails.

> Or is this mostly of 'the guest is doing something and does not seem
> to have a vnc/console, lets kill it' ?

For killing libxl_domain_destroy should be fine to call even if there is
an async op pending on that domain, the async op will naturally fail,
it'll probably log a lot but should do so gracefully. If there are cases
which do not we should fix them, since they would represent a bug
irrespective of any killing of the domain from under the op's feet.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
       [not found]   ` <1382776392.22417.179.camel@hastur.hellion.org.uk>
@ 2013-10-28  9:38     ` Simon Beaumont
  2013-10-28 15:52     ` Ian Jackson
  1 sibling, 0 replies; 29+ messages in thread
From: Simon Beaumont @ 2013-10-28  9:38 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Ian Jackson, xs-devel, xen-devel, Rob Hoes


On 26 Oct 2013, at 09:33, Ian Campbell <ian.campbell@citrix.com> wrote:

> On Wed, 2013-10-23 at 13:23 -0400, Konrad Rzeszutek Wilk wrote:
>> On Tue, Oct 15, 2013 at 04:08:31PM +0000, Simon Beaumont wrote:
>> Are there specific operations that you are focusing on to cancel?
> 
> Yes, this would be good to know. Ian J would know better than I but AIUI
> there is currently no generic support for cancelling an asynchronous op
> in libxl. I'm sure we could find a way to add this, for example as an
> abort after the current "step" succeeds/times out/fails.

I think this generic support is what we’re looking for. Ideally, we’d like to be able to cancel any potentially long-running function (i.e. those that can be called asynchronously).

More specifically, we’d like to be able to cancel at least the following:

* device_*_[add|remove|destroy]
* device_pci_assignable_[add|remove]
* libxl_domain_create_new
* libxl_domain_create_resume
* libxl_domain_shutdown
* libxl_domain_wait_shutdown
* libxl_domain_reboot
* libxl_domain_destroy
* libxl_domain_suspend
* libxl_domain_pause
* libxl_domain_unpause

>> Or is this mostly of 'the guest is doing something and does not seem
>> to have a vnc/console, lets kill it' ?
> 
> For killing libxl_domain_destroy should be fine to call even if there is
> an async op pending on that domain, the async op will naturally fail,
> it'll probably log a lot but should do so gracefully. If there are cases
> which do not we should fix them, since they would represent a bug
> irrespective of any killing of the domain from under the op's feet.

Not so much interested in this use case.

Si

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
       [not found]   ` <1382776392.22417.179.camel@hastur.hellion.org.uk>
  2013-10-28  9:38     ` Simon Beaumont
@ 2013-10-28 15:52     ` Ian Jackson
  2013-10-31 13:52       ` Ian Campbell
  1 sibling, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-10-28 15:52 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Simon Beaumont, xs-devel, xen-devel

Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> On Wed, 2013-10-23 at 13:23 -0400, Konrad Rzeszutek Wilk wrote:
> > Are there specific operations that you are focusing on to cancel?
> 
> Yes, this would be good to know. Ian J would know better than I but AIUI
> there is currently no generic support for cancelling an asynchronous op
> in libxl. I'm sure we could find a way to add this, for example as an
> abort after the current "step" succeeds/times out/fails.

That's right.

I think the right approach would be to provide a new kind of ao_how
value which returns a reference to the caller which can be used for
cancellation.

Cancellation would itself be asynchronous, in the sense that you sould
request cancellation but then still have to wait for the operation to
end.

So initially you could implement cancellation for only a subset of
operations and have the infrastructure disregard cancellation requests
for operations which don't support it.  I think the most obvious
operation that should be cancellable is domain migration.

Ideally there would be some kind of automatic plumbing which would
arrange (for example) that callers to the libxl xs watch functions, or
timeout functions, would get a "cancellation" notice instead.

> For killing libxl_domain_destroy should be fine to call even if there is
> an async op pending on that domain, the async op will naturally fail,
> it'll probably log a lot but should do so gracefully. If there are cases
> which do not we should fix them, since they would represent a bug
> irrespective of any killing of the domain from under the op's feet.

Yes.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-10-28 15:52     ` Ian Jackson
@ 2013-10-31 13:52       ` Ian Campbell
  2013-10-31 14:32         ` Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-10-31 13:52 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Simon Beaumont, xs-devel, xen-devel

On Mon, 2013-10-28 at 15:52 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> > On Wed, 2013-10-23 at 13:23 -0400, Konrad Rzeszutek Wilk wrote:
> > > Are there specific operations that you are focusing on to cancel?
> > 
> > Yes, this would be good to know. Ian J would know better than I but AIUI
> > there is currently no generic support for cancelling an asynchronous op
> > in libxl. I'm sure we could find a way to add this, for example as an
> > abort after the current "step" succeeds/times out/fails.
> 
> That's right.
> 
> I think the right approach would be to provide a new kind of ao_how
> value which returns a reference to the caller which can be used for
> cancellation.
> 
> Cancellation would itself be asynchronous, in the sense that you sould
> request cancellation but then still have to wait for the operation to
> end.
> 
> So initially you could implement cancellation for only a subset of
> operations and have the infrastructure disregard cancellation requests
> for operations which don't support it.  I think the most obvious
> operation that should be cancellable is domain migration.
> 
> Ideally there would be some kind of automatic plumbing which would
> arrange (for example) that callers to the libxl xs watch functions, or
> timeout functions, would get a "cancellation" notice instead.

The way in which sequential async subops are currently handled (by
chaining callbacks) is a bit ugly -- it leads to weird error messages
like "libxl__vtpm_foo: Failed" which actually means "couldn't add a
NIC" (because that was the previous step, and libxl__vtpm_foo is what
picks up the result). Do you think it would be feasible to recast those
as sequences of function pointers in an array with a central (common)
dispatcher which steps through, handling and reporting errors?

If things could be restructured along those lines then adding
cancellation support to the dispatcher might get us reasonably good API
coverage pretty easily

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-10-31 13:52       ` Ian Campbell
@ 2013-10-31 14:32         ` Ian Jackson
  2013-10-31 17:09           ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-10-31 14:32 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Simon Beaumont, xs-devel, xen-devel

Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> The way in which sequential async subops are currently handled (by
> chaining callbacks) is a bit ugly -- it leads to weird error messages
> like "libxl__vtpm_foo: Failed" which actually means "couldn't add a
> NIC" (because that was the previous step, and libxl__vtpm_foo is what
> picks up the result). Do you think it would be feasible to recast those
> as sequences of function pointers in an array with a central (common)
> dispatcher which steps through, handling and reporting errors?

Hmmm.  It might be.  There's a problem in that they all tend to have
different arguments.  And the control flow would be split into two
places.

Maybe it would be better to improve the error reporting to not rely on
the function name (which is TBH pretty pants anyway).

> If things could be restructured along those lines then adding
> cancellation support to the dispatcher might get us reasonably good API
> coverage pretty easily

Hmm.

Writing the control flow of all of these in a declarative style would
be difficult.  There are a lot of ifs and some parallel loops, as well
as simple sequential execution.

I think we would be in danger of writing a full-on coroutine system.
Maybe that wouldn't be such a bad idea.

Hmm.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-10-31 14:32         ` Ian Jackson
@ 2013-10-31 17:09           ` Ian Campbell
  2013-11-08 18:38             ` Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-10-31 17:09 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Simon Beaumont, xs-devel, xen-devel

On Thu, 2013-10-31 at 14:32 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> > The way in which sequential async subops are currently handled (by
> > chaining callbacks) is a bit ugly -- it leads to weird error messages
> > like "libxl__vtpm_foo: Failed" which actually means "couldn't add a
> > NIC" (because that was the previous step, and libxl__vtpm_foo is what
> > picks up the result). Do you think it would be feasible to recast those
> > as sequences of function pointers in an array with a central (common)
> > dispatcher which steps through, handling and reporting errors?
> 
> Hmmm.  It might be.  There's a problem in that they all tend to have
> different arguments.  And the control flow would be split into two
> places.
> 
> Maybe it would be better to improve the error reporting to not rely on
> the function name (which is TBH pretty pants anyway).

Yes. Even just adding a foo_done step would be better. e.g.:

do_nic_stuff -> nic_stuff_done (check nic errors) -> do_vtpm_stuff(vtpm
stuff)

rather than do_nic_stuff -> do_vtpm_stuf(check nic errors, vtpm stuff)
> 
> > If things could be restructured along those lines then adding
> > cancellation support to the dispatcher might get us reasonably good API
> > coverage pretty easily
> 
> Hmm.
> 
> Writing the control flow of all of these in a declarative style would
> be difficult.  There are a lot of ifs and some parallel loops, as well
> as simple sequential execution.

ifs seems like the easier case, you call the function and it returns a
code saying "done" without doing anything (if its that sort of if)
instead of "async work started" or you handle the two cases in the
function etc.

Don't the parallel loops coalesce into a single "everything done"
callback at some point? That seems to obvious place to resync with the
dispatcher.

> I think we would be in danger of writing a full-on coroutine system.
> Maybe that wouldn't be such a bad idea.

Indeed ;-)

> 
> Hmm.
> 
> Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-10-31 17:09           ` Ian Campbell
@ 2013-11-08 18:38             ` Ian Jackson
  2013-11-20 11:01               ` Ian Campbell
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Jackson @ 2013-11-08 18:38 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Simon Beaumont, xs-devel, xen-devel

I've been thinking about this some more and looking at the code.

I have the following sketch of an approach:

 * Somehow the ao_how API is changed to make it possible to return the
   ao to the caller (in the case of an asynchronous ao).

   (NB that there could be exciting races in the application between
   completing the ao and cancelling it; this means that the
   application can only use cancellation if it uses the callback
   mechanism and must make sure that the callback takes a lock and
   then makes changes to its data structurs which prevent the ao being
   cancelled.  As an alternative we could invent a separate
   "cancellation handle" which can be detached from the ao but which
   must always be explicitly destroyed by the application.)

   Suggestions for the API welcome.  The most obvious approach simply
   adds a new field to libxl_asyncop_how but that risks lots of
   existing code failing to initialise it.

 * Keep a list of cancellation hooks in the ao.  Anything which is
   using this ao can add itself to that set of hooks.

 * Cancellation involves repeatedly taking the front of that list off,
   and calling the hook.  (After the ao has been cancelled, its
   completion still needs to be awaited by the application, but it
   will hopefully complete earlier and return ERROR_CANCEL.)

 * The timeout registration facility is changed to take an ao and
   register a cancellation hook.  It is changed to provide an rc value
   to its callback, which will be FAIL or CANCEL.

 * The fork machinery is changed to take an ao, and register a
   cancellation hook.  A suitable-for-default-uses cancellation hook
   function is provided which sends SIGKILL to the child and makes a
   note that this has happened.  The child death callback provides an
   rc value (0 for status==0, or FAIL or CANCEL) for the convenience
   of the next layer up.

 * A new version of the xswatch event registration machinery is
   provided which takes an ao, registers itself as cancellation hooks,
   and provides an rc value to its callbacks.  This new facility could
   usefully do an xs_read on a predefined path.  The rc value will be
   OK or CANCEL.  (We need new versions of this because some xswatch
   callers are part of the infrastructure or libxl application event
   generation, not aos.)

 * Anything which uses the fd machinery directly needs to do
   cancellation itself (or ensure that it has a timeout, an xs watch,
   or a child).

A tricky question arises regarding cleanup: for example, if
libxl_domain_create_* were cancelled.  It would end up in
domcreate_complete with rc==CANCEL.

Should it now run the domain destruction ?  How would the caller say
they wanted that cancelled, if that too was taking too long ?  Perhaps
there should be a progress callback to say "we have finished
cancelling the first thing and are now cleaning up".

Or perhaps cancelling the operation should simply skip the destruction
and return the domid to the caller.  (But also, fiddly edge case:
consider what happens if a failed creation, which is already being
destroyed, is cancelled.)

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-11-08 18:38             ` Ian Jackson
@ 2013-11-20 11:01               ` Ian Campbell
  2013-12-20 18:24                 ` Ian Jackson
  0 siblings, 1 reply; 29+ messages in thread
From: Ian Campbell @ 2013-11-20 11:01 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Simon Beaumont, xs-devel, xen-devel

On Fri, 2013-11-08 at 18:38 +0000, Ian Jackson wrote:
> I've been thinking about this some more and looking at the code.
> 
> I have the following sketch of an approach:
> 
>  * Somehow the ao_how API is changed to make it possible to return the
>    ao to the caller (in the case of an asynchronous ao).
> 
>    (NB that there could be exciting races in the application between
>    completing the ao and cancelling it; this means that the
>    application can only use cancellation if it uses the callback
>    mechanism and must make sure that the callback takes a lock and
>    then makes changes to its data structurs which prevent the ao being
>    cancelled.  As an alternative we could invent a separate
>    "cancellation handle" which can be detached from the ao but which
>    must always be explicitly destroyed by the application.)

That sounds like the sort of cleanup operation which callers would be
pretty good at forgetting to do ;-) Or maybe it would be handled by
generic code in the callers and it would be fine.

>    Suggestions for the API welcome.  The most obvious approach simply
>    adds a new field to libxl_asyncop_how but that risks lots of
>    existing code failing to initialise it.

Should we have supplied libxl_asyncop_how_init functions? It's perhaps
not too late to do so.

The following 6 things are all internal to libxl, right?

>  * Keep a list of cancellation hooks in the ao.  Anything which is
>    using this ao can add itself to that set of hooks.
> 
>  * Cancellation involves repeatedly taking the front of that list off,
>    and calling the hook.  (After the ao has been cancelled, its
>    completion still needs to be awaited by the application, but it
>    will hopefully complete earlier and return ERROR_CANCEL.)
> 
>  * The timeout registration facility is changed to take an ao and
>    register a cancellation hook.  It is changed to provide an rc value
>    to its callback, which will be FAIL or CANCEL.
> 
>  * The fork machinery is changed to take an ao, and register a
>    cancellation hook.  A suitable-for-default-uses cancellation hook
>    function is provided which sends SIGKILL to the child and makes a
>    note that this has happened.  The child death callback provides an
>    rc value (0 for status==0, or FAIL or CANCEL) for the convenience
>    of the next layer up.
> 
>  * A new version of the xswatch event registration machinery is
>    provided which takes an ao, registers itself as cancellation hooks,
>    and provides an rc value to its callbacks.  This new facility could
>    usefully do an xs_read on a predefined path.  The rc value will be
>    OK or CANCEL.  (We need new versions of this because some xswatch
>    callers are part of the infrastructure or libxl application event
>    generation, not aos.)
> 
>  * Anything which uses the fd machinery directly needs to do
>    cancellation itself (or ensure that it has a timeout, an xs watch,
>    or a child).

All sounds plausible to me.

> A tricky question arises regarding cleanup: for example, if
> libxl_domain_create_* were cancelled.  It would end up in
> domcreate_complete with rc==CANCEL.
> 
> Should it now run the domain destruction ?  How would the caller say
> they wanted that cancelled, if that too was taking too long ?  Perhaps
> there should be a progress callback to say "we have finished
> cancelling the first thing and are now cleaning up".
> 
> Or perhaps cancelling the operation should simply skip the destruction
> and return the domid to the caller.  (But also, fiddly edge case:
> consider what happens if a failed creation, which is already being
> destroyed, is cancelled.)

I think either approach could be made to work, the question is how much
complexity we are willing to put into libxl vs the applications for
this. On first glance it seems that by putting a small amount of
complexity into the apps (you must destroy on cancel) we avoid a
potentially large amount in the library.

It seems to also make sense to allow the callers to manage the
destruction as a separate cancellable event which they can manage
independently if they wish.

Probably we should ask the existing potential users (xapi & libvirt?)
what they would prefer.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-11-20 11:01               ` Ian Campbell
@ 2013-12-20 18:24                 ` Ian Jackson
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
  2014-03-14 10:42                   ` FW: Cancelling asynchronous operations in libxl Ian Campbell
  0 siblings, 2 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:24 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Simon Beaumont, xs-devel, xen-devel

Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> On Fri, 2013-11-08 at 18:38 +0000, Ian Jackson wrote:
> > I've been thinking about this some more and looking at the code.
> > 
> > I have the following sketch of an approach:

I have now an RFC patch series, which I'm going to send as a reply to
this email.

> >  * Somehow the ao_how API is changed to make it possible to return the
> >    ao to the caller (in the case of an asynchronous ao).

I did this differently.

The caller is expected to provide a copy of their previous ao_how,
which hopefully is sufficiently unique (this is the responsibility of
the application).  This seems to deal satisfactorily with the lifetime
issues.

> The following 6 things are all internal to libxl, right?
> [stuff]

Yes.

> >  * The fork machinery is changed to take an ao, and register a
> >    cancellation hook.  A suitable-for-default-uses cancellation hook
> >    function is provided which sends SIGKILL to the child and makes a
> >    note that this has happened.  The child death callback provides an
> >    rc value (0 for status==0, or FAIL or CANCEL) for the convenience
> >    of the next layer up.

I have not yet done this in my RFC series.

> > A tricky question arises regarding cleanup: for example, if
> > libxl_domain_create_* were cancelled.  It would end up in
> > domcreate_complete with rc==CANCEL.
> > 
> > Should it now run the domain destruction ?  How would the caller say
> > they wanted that cancelled, if that too was taking too long ?  Perhaps
> > there should be a progress callback to say "we have finished
> > cancelling the first thing and are now cleaning up".

I decided to not do the destruction in the cancellation case.

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [RFC PATCH 00/14] libxl: Asynchronous event cancellation
  2013-12-20 18:24                 ` Ian Jackson
@ 2013-12-20 18:45                   ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 01/14] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
                                       ` (13 more replies)
  2014-03-14 10:42                   ` FW: Cancelling asynchronous operations in libxl Ian Campbell
  1 sibling, 14 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Campbell

Comments on my approach would be very welcome at this stage.
This applies only on top of my async suspend series.

I have compiled it but I HAVE NOT EXECUTED IT.

There is nothing to exercise this.  I think it might be possible to
make xl have a SIGINT handler which DTRT, but only at the cost of
making it multithreaded or giving it its own event loop.

Preparatory work on the suspend code's dodgy error handling:
 01/14 libxl: suspend: switch_logdirty_done takes rc
 02/14 libxl: suspend: common suspend callbacks take rc
 03/14 libxl: suspend: Return correct error from callbacks

Switching to libxl__xswait* to reduce number of timeout call sites:
 04/14 libxl: Use libxl__xswait* in libxl__ao_device
 05/14 libxl: xswait/devstate: Move xswait to before devstate
 06/14 libxl: devstate: Use libxl__xswait*

Public API changes:
 07/14 libxl: New error codes CANCELLED etc.
 09/14 libxl: domain create: Do not destroy on cancellation
 13/14 libxl: ao: Cancellation API

Internal API changes:
 08/14 libxl: events: Permit timeouts to signal cancellation
 10/14 libxl: ao: Record ultimate parent of a nested ao
 11/14 libxl: ao: Count the nested progeny of an ao
 12/14 libxl: ao: Provide manip_refcnt

Actual functionality!
 14/14 libxl: ao: Timeouts are cancellable

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/14] libxl: suspend: switch_logdirty_done takes rc
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 02/14] libxl: suspend: common suspend callbacks take rc Ian Jackson
                                       ` (12 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

switch_logdirty_done used to take the value to pass to
libxl__xc_domain_saverestore_async_callback_done (ie, the return value
from the callback).  (This was mistakenly described as "ok" in the
prototype, but in the definition it is "broke" and all the call sites
passed 0 for success or -1 for error.)

Instead, make it take a libxl error code (rc).  Convert this to the
suspend callback value at the end.

No functional change in this patch.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_dom.c |   23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 820ec3a..f5c76f2 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -775,7 +775,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int ok);
+                                 libxl__domain_suspend_state *dss, int rc);
 
 static void logdirty_init(libxl__logdirty_switch *lds)
 {
@@ -852,7 +852,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
  out:
     LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
     libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,-1);
+    switch_logdirty_done(egc,dss,rc);
 }
 
 static void domain_suspend_switch_qemu_xen_logdirty
@@ -900,7 +900,7 @@ static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,-1);
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
 }
 
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
@@ -952,17 +952,16 @@ static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
      */
     libxl__xs_transaction_abort(gc, &t);
 
-    if (!rc) {
-        switch_logdirty_done(egc,dss,0);
-    } else if (rc < 0) {
-        LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,-1);
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
     }
 }
 
 static void switch_logdirty_done(libxl__egc *egc,
                                  libxl__domain_suspend_state *dss,
-                                 int broke)
+                                 int rc)
 {
     STATE_AO_GC(dss->ao);
     libxl__logdirty_switch *lds = &dss->logdirty;
@@ -970,6 +969,12 @@ static void switch_logdirty_done(libxl__egc *egc,
     libxl__ev_xswatch_deregister(gc, &lds->watch);
     libxl__ev_time_deregister(gc, &lds->timeout);
 
+    int broke;
+    if (rc) {
+        broke = -1;
+    } else {
+        broke = 0;
+    }
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, broke);
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/14] libxl: suspend: common suspend callbacks take rc
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
  2013-12-20 18:45                     ` [PATCH 01/14] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 03/14] libxl: suspend: Return correct error from callbacks Ian Jackson
                                       ` (11 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Change the following functions to take a libxl error code rather than
a boolean "ok" value, and translate that value to the boolean expected
by libxc at the last moment:
  domain_suspend_callback_common_done        } dss->callback_common_done
  remus_domain_suspend_callback_common_done  }
  domain_suspend_common_done

Also, abolish domain_suspend_common_failed as
domain_suspend_common_done can easily do its job and the call sites
now have to supply the right rc value anyway.

In domain_suspend_common_guest_suspended, change "ret" to "rc"
as it contains a libxl error code.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_dom.c |   51 ++++++++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 29 deletions(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index f5c76f2..65a88c2 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -756,9 +756,9 @@ int libxl__toolstack_restore(uint32_t domid, const uint8_t *buf,
 static void domain_suspend_done(libxl__egc *egc,
                         libxl__domain_suspend_state *dss, int rc);
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dss, int rc);
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok);
+                                libxl__domain_suspend_state *dss, int rc);
 
 /*----- complicated callback, called by xc_domain_save -----*/
 
@@ -1045,11 +1045,9 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
       libxl__ev_time *ev, const struct timeval *requested_abs);
 
-static void domain_suspend_common_failed(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss);
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
-                                       bool ok);
+                                       int rc);
 
 static bool domain_suspend_pvcontrol_acked(const char *state) {
     /* any value other than "suspend", including ENOENT, is OK */
@@ -1079,6 +1077,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         ret = xc_evtchn_notify(CTX->xce, dss->guest_evtchn.port);
         if (ret < 0) {
             LOG(ERROR, "xc_evtchn_notify failed ret=%d", ret);
+            rc = ERROR_FAIL;
             goto err;
         }
 
@@ -1099,6 +1098,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         ret = xc_domain_shutdown(CTX->xch, domid, SHUTDOWN_suspend);
         if (ret < 0) {
             LOGE(ERROR, "xc_domain_shutdown failed");
+            rc = ERROR_FAIL;
             goto err;
         }
         /* The guest does not (need to) respond to this sort of request. */
@@ -1113,7 +1113,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
     libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend");
 
     dss->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid);
-    if (!dss->pvcontrol.path) goto err;
+    if (!dss->pvcontrol.path) { rc = ERROR_FAIL; goto err; }
 
     dss->pvcontrol.ao = ao;
     dss->pvcontrol.what = "guest acknowledgement of suspend request";
@@ -1123,7 +1123,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void domain_suspend_common_wait_guest_evtchn(libxl__egc *egc,
@@ -1198,7 +1198,7 @@ static void domain_suspend_common_pvcontrol_suspending(libxl__egc *egc,
 
  err:
     libxl__xs_transaction_abort(gc, &t);
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
     return;
 }
 
@@ -1222,7 +1222,7 @@ static void domain_suspend_common_wait_guest(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void suspend_common_wait_guest_watch(libxl__egc *egc,
@@ -1247,7 +1247,6 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     if (ret < 0) {
         LOGE(ERROR, "unable to check for status of guest %"PRId32"", domid);
         goto err;
-        domain_suspend_common_failed(egc, dss);
     }
 
     if (!(ret == 1 && info.domain == domid)) {
@@ -1273,7 +1272,7 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
     return;
 
  err:
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, ERROR_FAIL);
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
@@ -1282,46 +1281,40 @@ static void suspend_common_wait_guest_timeout(libxl__egc *egc,
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
     STATE_AO_GC(dss->ao);
     LOG(ERROR, "guest did not suspend, timed out");
-    domain_suspend_common_failed(egc, dss);
+    domain_suspend_common_done(egc, dss, ERROR_GUEST_TIMEDOUT);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
                                          libxl__domain_suspend_state *dss)
 {
     STATE_AO_GC(dss->ao);
-    int ret;
+    int rc;
 
     libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
     libxl__ev_time_deregister(gc, &dss->guest_timeout);
 
     if (dss->hvm) {
-        ret = libxl__domain_suspend_device_model(gc, dss);
-        if (ret) {
-            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", ret);
-            domain_suspend_common_failed(egc, dss);
+        rc = libxl__domain_suspend_device_model(gc, dss);
+        if (rc) {
+            LOG(ERROR, "libxl__domain_suspend_device_model failed ret=%d", rc);
+            domain_suspend_common_done(egc, dss, rc);
             return;
         }
     }
-    domain_suspend_common_done(egc, dss, 1);
-}
-
-static void domain_suspend_common_failed(libxl__egc *egc,
-                                         libxl__domain_suspend_state *dss)
-{
     domain_suspend_common_done(egc, dss, 0);
 }
 
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
-                                       bool ok)
+                                       int rc)
 {
     EGC_GC;
     assert(!libxl__xswait_inuse(&dss->pvcontrol));
     libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
     libxl__ev_xswatch_deregister(gc, &dss->guest_watch);
     libxl__ev_time_deregister(gc, &dss->guest_timeout);
-    dss->callback_common_done(egc, dss, ok);
+    dss->callback_common_done(egc, dss, rc);
 }
 
 static inline char *physmap_path(libxl__gc *gc, uint32_t domid,
@@ -1419,9 +1412,9 @@ static void libxl__domain_suspend_callback(void *data)
 }
 
 static void domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok)
+                                libxl__domain_suspend_state *dss, int rc)
 {
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }    
 
 /*----- remus callbacks -----*/
@@ -1437,10 +1430,10 @@ static void libxl__remus_domain_suspend_callback(void *data)
 }
 
 static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
-                                libxl__domain_suspend_state *dss, int ok)
+                                libxl__domain_suspend_state *dss, int rc)
 {
     /* REMUS TODO: Issue disk and network checkpoint reqs. */
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }    
 
 static int libxl__remus_domain_resume_callback(void *data)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/14] libxl: suspend: Return correct error from callbacks
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
  2013-12-20 18:45                     ` [PATCH 01/14] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
  2013-12-20 18:45                     ` [PATCH 02/14] libxl: suspend: common suspend callbacks take rc Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 04/14] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
                                       ` (10 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

If a suspend callback fails, it has a libxl error code in its hand.
However we must return to libxc the values that libxc expects.  So we
stash the libxl error code in dss->rc and fish it out again after
libxc returns from the suspend call.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_dom.c      |   14 +++++++++++++-
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 65a88c2..0fda9a4 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -869,6 +869,7 @@ static void domain_suspend_switch_qemu_xen_logdirty
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
     } else {
         LOG(ERROR,"logdirty switch failed (rc=%d), aborting suspend",rc);
+        dss->rc = rc;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
 }
@@ -891,6 +892,7 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     default:
         LOG(ERROR,"logdirty switch failed"
             ", no valid device model version found, aborting suspend");
+        dss->rc = ERROR_FAIL;
         libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
     }
 }
@@ -972,6 +974,7 @@ static void switch_logdirty_done(libxl__egc *egc,
     int broke;
     if (rc) {
         broke = -1;
+        dss->rc = rc;
     } else {
         broke = 0;
     }
@@ -1414,6 +1417,7 @@ static void libxl__domain_suspend_callback(void *data)
 static void domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss, int rc)
 {
+    dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }    
 
@@ -1433,6 +1437,7 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
                                 libxl__domain_suspend_state *dss, int rc)
 {
     /* REMUS TODO: Issue disk and network checkpoint reqs. */
+    dss->rc = rc;
     libxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, !rc);
 }    
 
@@ -1441,10 +1446,14 @@ static int libxl__remus_domain_resume_callback(void *data)
     libxl__save_helper_state *shs = data;
     libxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
     STATE_AO_GC(dss->ao);
+    int rc;
 
     /* Resumes the domain and the device model */
-    if (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
+    rc = libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1);
+    if (rc) {
+        dss->rc = rc;
         return 0;
+    }
 
     /* REMUS TODO: Deal with disk. Start a new network output buffer */
     return 1;
@@ -1498,6 +1507,7 @@ void libxl__domain_suspend(libxl__egc *egc, libxl__domain_suspend_state *dss)
     libxl__srm_save_autogen_callbacks *const callbacks =
         &dss->shs.callbacks.save.a;
 
+    dss->rc = 0;
     logdirty_init(&dss->logdirty);
     libxl__xswait_init(&dss->pvcontrol);
     libxl__ev_evtchn_init(&dss->guest_evtchn);
@@ -1591,6 +1601,8 @@ void libxl__xc_domain_save_done(libxl__egc *egc, void *dss_void,
                          "domain did not respond to suspend request");
         if ( !dss->guest_responded )
             rc = ERROR_GUEST_TIMEDOUT;
+        else if (dss->rc)
+            rc = dss->rc;
         else
             rc = ERROR_FAIL;
         goto out;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 7c98d5b..1ad2516 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2404,6 +2404,7 @@ struct libxl__domain_suspend_state {
     int debug;
     const libxl_domain_remus_info *remus;
     /* private */
+    int rc;
     libxl__ev_evtchn guest_evtchn;
     int guest_evtchn_lockfd;
     int hvm;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/14] libxl: Use libxl__xswait* in libxl__ao_device
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (2 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 03/14] libxl: suspend: Return correct error from callbacks Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 05/14] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
                                       ` (9 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Replace the separate timeout and xenstore watch with use of
libxl__xswait*.

Different control flow, but no ultimate functional change apart from
slight changes to the text of error messages.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_device.c   |   64 ++++++++++++------------------------------
 tools/libxl/libxl_internal.h |    4 +--
 2 files changed, 20 insertions(+), 48 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index ba7d100..c1609a7 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -436,7 +436,7 @@ void libxl__prepare_ao_device(libxl__ao *ao, libxl__ao_device *aodev)
      * Initialize xs_watch, because it's not used on all possible
      * execution paths, but it's unconditionally destroyed when finished.
      */
-    libxl__ev_xswatch_init(&aodev->xs_watch);
+    libxl__xswait_init(&aodev->xswait);
     aodev->active = 1;
     /* We init this here because we might call device_hotplug_done
      * without actually calling any hotplug script */
@@ -714,13 +714,9 @@ static void device_hotplug_child_death_cb(libxl__egc *egc,
                                           libxl__ev_child *child,
                                           pid_t pid, int status);
 
-static void device_destroy_be_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                         const struct timeval *requested_abs);
-
 static void device_destroy_be_watch_cb(libxl__egc *egc,
-                                       libxl__ev_xswatch *watch,
-                                       const char *watch_path,
-                                       const char *event_path);
+                                       libxl__xswait_state *xswait,
+                                       int rc, const char *data);
 
 static void device_hotplug_done(libxl__egc *egc, libxl__ao_device *aodev);
 
@@ -970,22 +966,14 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
         if (aodev->action != LIBXL__DEVICE_ACTION_REMOVE)
             goto out;
 
-        rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
-                                         device_destroy_be_timeout_cb,
-                                         LIBXL_DESTROY_TIMEOUT * 1000);
-        if (rc) {
-            LOG(ERROR, "setup of xs watch timeout failed");
-            goto out;
-        }
-
-        rc = libxl__ev_xswatch_register(gc, &aodev->xs_watch,
-                                        device_destroy_be_watch_cb,
-                                        be_path);
-        if (rc) {
-            LOG(ERROR, "setup of xs watch for %s failed", be_path);
-            libxl__ev_time_deregister(gc, &aodev->timeout);
+        aodev->xswait.ao = ao;
+        aodev->xswait.what = "removal of backend path";
+        aodev->xswait.path = be_path;
+        aodev->xswait.timeout_ms = LIBXL_DESTROY_TIMEOUT * 1000;
+        aodev->xswait.callback = device_destroy_be_watch_cb;
+        rc = libxl__xswait_start(gc, &aodev->xswait);
+        if (rc)
             goto out;
-        }
         return;
     }
 
@@ -1105,37 +1093,21 @@ error:
     device_hotplug_done(egc, aodev);
 }
 
-static void device_destroy_be_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                         const struct timeval *requested_abs)
-{
-    libxl__ao_device *aodev = CONTAINER_OF(ev, *aodev, timeout);
-    STATE_AO_GC(aodev->ao);
-
-    LOG(ERROR, "timed out while waiting for %s to be removed",
-               libxl__device_backend_path(gc, aodev->dev));
-
-    aodev->rc = ERROR_TIMEDOUT;
-
-    device_hotplug_done(egc, aodev);
-    return;
-}
-
 static void device_destroy_be_watch_cb(libxl__egc *egc,
-                                       libxl__ev_xswatch *watch,
-                                       const char *watch_path,
-                                       const char *event_path)
+                                       libxl__xswait_state *xswait,
+                                       int rc, const char *dir)
 {
-    libxl__ao_device *aodev = CONTAINER_OF(watch, *aodev, xs_watch);
+    libxl__ao_device *aodev = CONTAINER_OF(xswait, *aodev, xswait);
     STATE_AO_GC(aodev->ao);
-    const char *dir;
-    int rc;
 
-    rc = libxl__xs_read_checked(gc, XBT_NULL, watch_path, &dir);
     if (rc) {
-        LOG(ERROR, "unable to read backend path: %s", watch_path);
+        if (rc == ERROR_TIMEDOUT)
+            LOG(ERROR, "timed out while waiting for %s to be removed",
+                xswait->path);
         aodev->rc = rc;
         goto out;
     }
+
     if (dir) {
         /* backend path still exists, wait a little longer... */
         return;
@@ -1168,7 +1140,7 @@ static void device_hotplug_clean(libxl__gc *gc, libxl__ao_device *aodev)
 {
     /* Clean events and check reentrancy */
     libxl__ev_time_deregister(gc, &aodev->timeout);
-    libxl__ev_xswatch_deregister(gc, &aodev->xs_watch);
+    libxl__xswait_stop(gc, &aodev->xswait);
     assert(!libxl__ev_child_inuse(&aodev->child));
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 1ad2516..ffafd25 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2025,10 +2025,10 @@ struct libxl__ao_device {
     libxl__multidev *multidev; /* reference to the containing multidev */
     /* private for add/remove implementation */
     libxl__ev_devstate backend_ds;
-    /* Bodge for Qemu devices, also used for timeout of hotplug execution */
+    /* Bodge for Qemu devices */
     libxl__ev_time timeout;
     /* xenstore watch for backend path of driver domains */
-    libxl__ev_xswatch xs_watch;
+    libxl__xswait_state xswait;
     /* device hotplug execution */
     const char *what;
     int num_exec;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/14] libxl: xswait/devstate: Move xswait to before devstate
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (3 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 04/14] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 06/14] libxl: devstate: Use libxl__xswait* Ian Jackson
                                       ` (8 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Pure code motion.  We are going to make devstate use xswait.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_internal.h |  105 +++++++++++++++++++++---------------------
 1 file changed, 53 insertions(+), 52 deletions(-)

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index ffafd25..07283c9 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1030,6 +1030,59 @@ _hidden const char *libxl__device_nic_devname(libxl__gc *gc,
 
 _hidden int libxl__get_domid(libxl__gc *gc, uint32_t *domid);
 
+/*----- xswait: wait for a xenstore node to be suitable -----*/
+
+typedef struct libxl__xswait_state libxl__xswait_state;
+
+/*
+ * rc describes the circumstances of this callback:
+ *
+ * rc==0
+ *
+ *     The xenstore path (may have) changed.  It has been read for
+ *     you.  The result is in data (allocated from the ao gc).
+ *     data may be NULL, which means that the xenstore read gave
+ *     ENOENT.
+ *
+ *     If you are satisfied, you MUST call libxl__xswait_stop.
+ *     Otherwise, xswait will continue waiting and watching and
+ *     will call you back later.
+ *
+ * rc==ERROR_TIMEDOUT
+ *
+ *     The specified timeout was reached.
+ *     This has NOT been logged (except to the debug log).
+ *     xswait will not continue (but calling libxl__xswait_stop is OK).
+ *
+ * rc!=0, !=ERROR_TIMEDOUT
+ *
+ *     Some other error occurred.
+ *     This HAS been logged.
+ *     xswait will not continue (but calling libxl__xswait_stop is OK).
+ *
+ */
+typedef void libxl__xswait_callback(libxl__egc *egc,
+      libxl__xswait_state *xswa, int rc, const char *data);
+
+struct libxl__xswait_state {
+    /* caller must fill these in, and they must all remain valid */
+    libxl__ao *ao;
+    const char *what; /* for error msgs: noun phrase, what we're waiting for */
+    const char *path;
+    int timeout_ms; /* as for poll(2) */
+    libxl__xswait_callback *callback;
+    /* remaining fields are private to xswait */
+    libxl__ev_time time_ev;
+    libxl__ev_xswatch watch_ev;
+};
+
+void libxl__xswait_init(libxl__xswait_state*);
+void libxl__xswait_stop(libxl__gc*, libxl__xswait_state*); /*idempotent*/
+bool libxl__xswait_inuse(const libxl__xswait_state *ss);
+
+int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
+
+
 /*
  * libxl__ev_devstate - waits a given time for a device to
  * reach a given state.  Follows the libxl_ev_* conventions.
@@ -1117,58 +1170,6 @@ _hidden int libxl__create_pci_backend(libxl__gc *gc, uint32_t domid,
                                       libxl_device_pci *pcidev, int num);
 _hidden int libxl__device_pci_destroy_all(libxl__gc *gc, uint32_t domid);
 
-/*----- xswait: wait for a xenstore node to be suitable -----*/
-
-typedef struct libxl__xswait_state libxl__xswait_state;
-
-/*
- * rc describes the circumstances of this callback:
- *
- * rc==0
- *
- *     The xenstore path (may have) changed.  It has been read for
- *     you.  The result is in data (allocated from the ao gc).
- *     data may be NULL, which means that the xenstore read gave
- *     ENOENT.
- *
- *     If you are satisfied, you MUST call libxl__xswait_stop.
- *     Otherwise, xswait will continue waiting and watching and
- *     will call you back later.
- *
- * rc==ERROR_TIMEDOUT
- *
- *     The specified timeout was reached.
- *     This has NOT been logged (except to the debug log).
- *     xswait will not continue (but calling libxl__xswait_stop is OK).
- *
- * rc!=0, !=ERROR_TIMEDOUT
- *
- *     Some other error occurred.
- *     This HAS been logged.
- *     xswait will not continue (but calling libxl__xswait_stop is OK).
- *
- */
-typedef void libxl__xswait_callback(libxl__egc *egc,
-      libxl__xswait_state *xswa, int rc, const char *data);
-
-struct libxl__xswait_state {
-    /* caller must fill these in, and they must all remain valid */
-    libxl__ao *ao;
-    const char *what; /* for error msgs: noun phrase, what we're waiting for */
-    const char *path;
-    int timeout_ms; /* as for poll(2) */
-    libxl__xswait_callback *callback;
-    /* remaining fields are private to xswait */
-    libxl__ev_time time_ev;
-    libxl__ev_xswatch watch_ev;
-};
-
-void libxl__xswait_init(libxl__xswait_state*);
-void libxl__xswait_stop(libxl__gc*, libxl__xswait_state*); /*idempotent*/
-bool libxl__xswait_inuse(const libxl__xswait_state *ss);
-
-int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
-
 /*
  *----- spawn -----
  *
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/14] libxl: devstate: Use libxl__xswait*
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (4 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 05/14] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 07/14] libxl: New error codes CANCELLED etc Ian Jackson
                                       ` (7 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_device.c   |    4 +--
 tools/libxl/libxl_event.c    |   78 ++++++++++++++++++------------------------
 tools/libxl/libxl_internal.h |   11 +++---
 3 files changed, 40 insertions(+), 53 deletions(-)

diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index c1609a7..2ea95d6 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -741,7 +741,7 @@ void libxl__wait_device_connection(libxl__egc *egc, libxl__ao_device *aodev)
         return;
     }
 
-    rc = libxl__ev_devstate_wait(gc, &aodev->backend_ds,
+    rc = libxl__ev_devstate_wait(ao, &aodev->backend_ds,
                                  device_backend_callback,
                                  state_path, XenbusStateInitWait,
                                  LIBXL_INIT_TIMEOUT * 1000);
@@ -841,7 +841,7 @@ void libxl__initiate_device_remove(libxl__egc *egc,
         if (rc < 0) goto out;
     }
 
-    rc = libxl__ev_devstate_wait(gc, &aodev->backend_ds,
+    rc = libxl__ev_devstate_wait(ao, &aodev->backend_ds,
                                  device_backend_callback,
                                  state_path, XenbusStateClosed,
                                  LIBXL_DESTROY_TIMEOUT * 1000);
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 2d9366b..696c744 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -770,68 +770,58 @@ void libxl__ev_evtchn_cancel(libxl__gc *gc, libxl__ev_evtchn *evev)
  * waiting for device state
  */
 
-static void devstate_watch_callback(libxl__egc *egc, libxl__ev_xswatch *watch,
-                                const char *watch_path, const char *event_path)
+static void devstate_callback(libxl__egc *egc, libxl__xswait_state *xsw,
+                              int rc, const char *sstate)
 {
     EGC_GC;
-    libxl__ev_devstate *ds = CONTAINER_OF(watch, *ds, watch);
-    int rc;
+    libxl__ev_devstate *ds = CONTAINER_OF(xsw, *ds, w);
 
-    char *sstate = libxl__xs_read(gc, XBT_NULL, watch_path);
+    if (rc) {
+        if (rc == ERROR_TIMEDOUT)
+            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d "
+                       " timed out", ds->w.path, ds->wanted);
+        goto out;
+    }
     if (!sstate) {
-        if (errno == ENOENT) {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
-                       " but it was removed", watch_path, ds->wanted);
-            rc = ERROR_INVAL;
-        } else {
-            LIBXL__LOG_ERRNO(CTX, LIBXL__LOG_ERROR, "backend %s wanted state"
-                             " %d but read failed", watch_path, ds->wanted);
-            rc = ERROR_FAIL;
-        }
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
+                   " but it was removed", ds->w.path, ds->wanted);
+        rc = ERROR_INVAL;
+        goto out;
+    }
+
+    int got = atoi(sstate);
+    if (got == ds->wanted) {
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d ok",
+                   ds->w.path, ds->wanted);
+        rc = 0;
     } else {
-        int got = atoi(sstate);
-        if (got == ds->wanted) {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d ok",
-                       watch_path, ds->wanted);
-            rc = 0;
-        } else {
-            LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
-                       " still waiting state %d", watch_path, ds->wanted, got);
-            return;
-        }
+        LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d"
+                   " still waiting state %d", ds->w.path, ds->wanted, got);
+        return;
     }
-    libxl__ev_devstate_cancel(gc, ds);
-    ds->callback(egc, ds, rc);
-}
 
-static void devstate_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                             const struct timeval *requested_abs)
-{
-    EGC_GC;
-    libxl__ev_devstate *ds = CONTAINER_OF(ev, *ds, timeout);
-    LIBXL__LOG(CTX, LIBXL__LOG_DEBUG, "backend %s wanted state %d "
-               " timed out", ds->watch.path, ds->wanted);
+ out:
     libxl__ev_devstate_cancel(gc, ds);
-    ds->callback(egc, ds, ERROR_TIMEDOUT);
+    ds->callback(egc, ds, rc);
 }
 
-int libxl__ev_devstate_wait(libxl__gc *gc, libxl__ev_devstate *ds,
+int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
                             libxl__ev_devstate_callback cb,
                             const char *state_path, int state, int milliseconds)
 {
+    AO_GC;
     int rc;
 
-    libxl__ev_time_init(&ds->timeout);
-    libxl__ev_xswatch_init(&ds->watch);
+    libxl__xswait_init(&ds->w);
     ds->wanted = state;
     ds->callback = cb;
 
-    rc = libxl__ev_time_register_rel(gc, &ds->timeout, devstate_timeout,
-                                     milliseconds);
-    if (rc) goto out;
-
-    rc = libxl__ev_xswatch_register(gc, &ds->watch, devstate_watch_callback,
-                                    state_path);
+    ds->w.what = GCSPRINTF("backend %s (hoping for state change to %d)",
+                           state_path, state);
+    ds->w.path = state_path;
+    ds->w.timeout_ms = milliseconds;
+    ds->w.callback = devstate_callback;
+    rc = libxl__xswait_start(gc, &ds->w);
     if (rc) goto out;
 
     return 0;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 07283c9..af41200 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1101,24 +1101,21 @@ struct libxl__ev_devstate {
     libxl__ev_devstate_callback *callback;
     /* as for the remainder, read-only public parts may also be
      * read by the caller (notably, watch.path), but only when waiting: */
-    libxl__ev_xswatch watch;
-    libxl__ev_time timeout;
+    libxl__xswait_state w;
 };
 
 static inline void libxl__ev_devstate_init(libxl__ev_devstate *ds)
 {
-    libxl__ev_time_init(&ds->timeout);
-    libxl__ev_xswatch_init(&ds->watch);
+    libxl__xswait_init(&ds->w);
 }
 
 static inline void libxl__ev_devstate_cancel(libxl__gc *gc,
                                              libxl__ev_devstate *ds)
 {
-    libxl__ev_time_deregister(gc,&ds->timeout);
-    libxl__ev_xswatch_deregister(gc,&ds->watch);
+    libxl__xswait_stop(gc,&ds->w);
 }
 
-_hidden int libxl__ev_devstate_wait(libxl__gc *gc, libxl__ev_devstate *ds,
+_hidden int libxl__ev_devstate_wait(libxl__ao *ao, libxl__ev_devstate *ds,
                                     libxl__ev_devstate_callback cb,
                                     const char *state_path,
                                     int state, int milliseconds);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/14] libxl: New error codes CANCELLED etc.
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (5 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 06/14] libxl: devstate: Use libxl__xswait* Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 08/14] libxl: events: Permit timeouts to signal cancellation Ian Jackson
                                       ` (6 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We introduce ERROR_CANCELLED now, so that we can write code to handle
it, and decreee that functions might return it, even though currently
there is nowhere where this error is generated.

While we're here, provide ERROR_NOTFOUND and ERROR_NOTIMPLEMENTED,
which will also be used later, but only as part of the public API.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>

squash! libxl: New error code CANCELLED
---
 tools/libxl/libxl_types.idl |    3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 649ce50..9466bf9 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -43,6 +43,9 @@ libxl_error = Enumeration("error", [
     (-12, "OSEVENT_REG_FAIL"),
     (-13, "BUFFERFULL"),
     (-14, "UNKNOWN_CHILD"),
+    (-15, "CANCELLED"),
+    (-16, "NOTFOUND"),
+    (-17, "NOTIMPLEMENTED"),
     ], value_namespace = "")
 
 libxl_domain_type = Enumeration("domain_type", [
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/14] libxl: events: Permit timeouts to signal cancellation
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (6 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 07/14] libxl: New error codes CANCELLED etc Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 09/14] libxl: domain create: Do not destroy on cancellation Ian Jackson
                                       ` (5 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

We make these semantic changes to the interface to libxl__ev_time:

 * The callback functions provided by users must take an rc value.
   This rc value can be ERROR_TIMEDOUT or ERROR_CANCELLED.

 * The setup functions take a libxl__ao, not a libxl__gc.  This means
   that timeouts can only occur as part of a long-running libxl
   function (but this is of course correct, as libxl shouldn't have
   any global timeouts, and indeed all the call sites have an ao).

The consequential changes at the points where timeouts are used are
generally obvious.  However:

 * device_hotplug_timeout_cb/device_hotplug_child_death_cb need to
   take a bit more care to preserve the error code passed to the
   timeout.

 * Users of xswait are now expected to deal correctly with
   ERROR_CANCELLED.  If they experience this, it hasn't been logged.
   And the caller won't log it either since it's not TIMEDOUT.
   Luckily this is correct, so we can just change the doc comment.

Currently nothing generates ERROR_CANCELLED; in particular the
timeouts cannot in fact signal cancellation.

There should be no publicly visible change except that some error
returns from libxl will change from ERROR_FAIL to ERROR_TIMEDOUT, and
some changes to debugging messages.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_aoutils.c  |    7 ++++---
 tools/libxl/libxl_device.c   |   24 ++++++++++++++++--------
 tools/libxl/libxl_dom.c      |   23 ++++++++++++++---------
 tools/libxl/libxl_event.c    |   10 ++++++----
 tools/libxl/libxl_internal.h |   16 +++++++++-------
 5 files changed, 49 insertions(+), 31 deletions(-)

diff --git a/tools/libxl/libxl_aoutils.c b/tools/libxl/libxl_aoutils.c
index 477717b..34a3305 100644
--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -46,7 +46,7 @@ int libxl__xswait_start(libxl__gc *gc, libxl__xswait_state *xswa)
 {
     int rc;
 
-    rc = libxl__ev_time_register_rel(gc, &xswa->time_ev,
+    rc = libxl__ev_time_register_rel(xswa->ao, &xswa->time_ev,
                                      xswait_timeout_callback, xswa->timeout_ms);
     if (rc) goto err;
 
@@ -76,12 +76,13 @@ void xswait_xswatch_callback(libxl__egc *egc, libxl__ev_xswatch *xsw,
 }
 
 void xswait_timeout_callback(libxl__egc *egc, libxl__ev_time *ev,
-                             const struct timeval *requested_abs)
+                             const struct timeval *requested_abs,
+                             int rc)
 {
     EGC_GC;
     libxl__xswait_state *xswa = CONTAINER_OF(ev, *xswa, time_ev);
     LOG(DEBUG, "%s: xswait timeout (path=%s)", xswa->what, xswa->path);
-    xswait_report_error(egc, xswa, ERROR_TIMEDOUT);
+    xswait_report_error(egc, xswa, rc);
 }
 
 static void xswait_report_error(libxl__egc *egc, libxl__xswait_state *xswa,
diff --git a/tools/libxl/libxl_device.c b/tools/libxl/libxl_device.c
index 2ea95d6..a795671 100644
--- a/tools/libxl/libxl_device.c
+++ b/tools/libxl/libxl_device.c
@@ -697,7 +697,7 @@ out:
 
 /* This callback is part of the Qemu devices Badge */
 static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                const struct timeval *requested_abs);
+                                const struct timeval *requested_abs, int rc);
 
 static void device_backend_callback(libxl__egc *egc, libxl__ev_devstate *ds,
                                    int rc);
@@ -708,7 +708,8 @@ static void device_backend_cleanup(libxl__gc *gc,
 static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev);
 
 static void device_hotplug_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                      const struct timeval *requested_abs);
+                                      const struct timeval *requested_abs,
+                                      int rc);
 
 static void device_hotplug_child_death_cb(libxl__egc *egc,
                                           libxl__ev_child *child,
@@ -790,7 +791,7 @@ void libxl__initiate_device_remove(libxl__egc *egc,
              * TODO: 4.2 Bodge due to QEMU, see comment on top of
              * libxl__initiate_device_remove in libxl_internal.h
              */
-            rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
+            rc = libxl__ev_time_register_rel(ao, &aodev->timeout,
                                              device_qemu_timeout,
                                              LIBXL_QEMU_BODGE_TIMEOUT * 1000);
             if (rc) {
@@ -862,7 +863,7 @@ out:
 }
 
 static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                const struct timeval *requested_abs)
+                                const struct timeval *requested_abs, int rc)
 {
     libxl__ao_device *aodev = CONTAINER_OF(ev, *aodev, timeout);
     STATE_AO_GC(aodev->ao);
@@ -870,7 +871,9 @@ static void device_qemu_timeout(libxl__egc *egc, libxl__ev_time *ev,
     char *state_path = GCSPRINTF("%s/state", be_path);
     const char *xs_state;
     xs_transaction_t t = 0;
-    int rc = 0;
+
+    if (rc != ERROR_TIMEDOUT)
+        goto out;
 
     libxl__ev_time_deregister(gc, &aodev->timeout);
 
@@ -998,7 +1001,7 @@ static void device_hotplug(libxl__egc *egc, libxl__ao_device *aodev)
     }
 
     /* Set hotplug timeout */
-    rc = libxl__ev_time_register_rel(gc, &aodev->timeout,
+    rc = libxl__ev_time_register_rel(ao, &aodev->timeout,
                                      device_hotplug_timeout_cb,
                                      LIBXL_HOTPLUG_TIMEOUT * 1000);
     if (rc) {
@@ -1035,11 +1038,15 @@ out:
 }
 
 static void device_hotplug_timeout_cb(libxl__egc *egc, libxl__ev_time *ev,
-                                      const struct timeval *requested_abs)
+                                      const struct timeval *requested_abs,
+                                      int rc)
 {
     libxl__ao_device *aodev = CONTAINER_OF(ev, *aodev, timeout);
     STATE_AO_GC(aodev->ao);
 
+    if (!aodev->rc)
+        aodev->rc = rc;
+
     libxl__ev_time_deregister(gc, &aodev->timeout);
 
     assert(libxl__ev_child_inuse(&aodev->child));
@@ -1070,7 +1077,8 @@ static void device_hotplug_child_death_cb(libxl__egc *egc,
                                        GCSPRINTF("%s/hotplug-error", be_path));
         if (hotplug_error)
             LOG(ERROR, "script: %s", hotplug_error);
-        aodev->rc = ERROR_FAIL;
+        if (!aodev->rc)
+            aodev->rc = ERROR_FAIL;
         if (aodev->action == LIBXL__DEVICE_ACTION_ADD)
             /*
              * Only fail on device connection, on disconnection
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 0fda9a4..1b1d06f 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -771,7 +771,8 @@ static void remus_domain_suspend_callback_common_done(libxl__egc *egc,
  */
 
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs);
+                                    const struct timeval *requested_abs,
+                                    int rc);
 static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
                             const char *watch_path, const char *event_path);
 static void switch_logdirty_done(libxl__egc *egc,
@@ -808,7 +809,7 @@ static void domain_suspend_switch_qemu_xen_traditional_logdirty
                                 switch_logdirty_xswatch, lds->ret_path);
     if (rc) goto out;
 
-    rc = libxl__ev_time_register_rel(gc, &lds->timeout,
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
                                 switch_logdirty_timeout, 10*1000);
     if (rc) goto out;
 
@@ -897,7 +898,8 @@ void libxl__domain_suspend_common_switch_qemu_logdirty
     }
 }
 static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs)
+                                    const struct timeval *requested_abs,
+                                    int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
     STATE_AO_GC(dss->ao);
@@ -1046,7 +1048,7 @@ static void suspend_common_wait_guest_watch(libxl__egc *egc,
 static void suspend_common_wait_guest_check(libxl__egc *egc,
         libxl__domain_suspend_state *dss);
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs);
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc);
 
 static void domain_suspend_common_done(libxl__egc *egc,
                                        libxl__domain_suspend_state *dss,
@@ -1088,7 +1090,7 @@ static void domain_suspend_callback_common(libxl__egc *egc,
         rc = libxl__ev_evtchn_wait(gc, &dss->guest_evtchn);
         if (rc) goto err;
 
-        rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout,
+        rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout,
                                          suspend_common_wait_guest_timeout,
                                          60*1000);
         if (rc) goto err;
@@ -1218,7 +1220,7 @@ static void domain_suspend_common_wait_guest(libxl__egc *egc,
                                     "@releaseDomain");
     if (rc) goto err;
 
-    rc = libxl__ev_time_register_rel(gc, &dss->guest_timeout, 
+    rc = libxl__ev_time_register_rel(ao, &dss->guest_timeout, 
                                      suspend_common_wait_guest_timeout,
                                      60*1000);
     if (rc) goto err;
@@ -1279,12 +1281,15 @@ static void suspend_common_wait_guest_check(libxl__egc *egc,
 }
 
 static void suspend_common_wait_guest_timeout(libxl__egc *egc,
-      libxl__ev_time *ev, const struct timeval *requested_abs)
+      libxl__ev_time *ev, const struct timeval *requested_abs, int rc)
 {
     libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, guest_timeout);
     STATE_AO_GC(dss->ao);
-    LOG(ERROR, "guest did not suspend, timed out");
-    domain_suspend_common_done(egc, dss, ERROR_GUEST_TIMEDOUT);
+    if (rc == ERROR_TIMEDOUT) {
+        LOG(ERROR, "guest did not suspend, timed out");
+        rc = ERROR_GUEST_TIMEDOUT;
+    }
+    domain_suspend_common_done(egc, dss, rc);
 }
 
 static void domain_suspend_common_guest_suspended(libxl__egc *egc,
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 696c744..28b134a 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -309,10 +309,11 @@ static void time_done_debug(libxl__gc *gc, const char *func,
 #endif
 }
 
-int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
+int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 struct timeval absolute)
 {
+    AO_GC;
     int rc;
 
     CTX_LOCK;
@@ -333,10 +334,11 @@ int libxl__ev_time_register_abs(libxl__gc *gc, libxl__ev_time *ev,
 }
 
 
-int libxl__ev_time_register_rel(libxl__gc *gc, libxl__ev_time *ev,
+int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 int milliseconds /* as for poll(2) */)
 {
+    AO_GC;
     struct timeval absolute;
     int rc;
 
@@ -1154,7 +1156,7 @@ static void afterpoll_internal(libxl__egc *egc, libxl__poller *poller,
             etime, (unsigned long)etime->abs.tv_sec,
             (unsigned long)etime->abs.tv_usec);
 
-        etime->func(egc, etime, &etime->abs);
+        etime->func(egc, etime, &etime->abs, ERROR_TIMEDOUT);
     }
 }
 
@@ -1235,7 +1237,7 @@ void libxl_osevent_occurred_timeout(libxl_ctx *ctx, void *for_libxl)
     assert(!ev->infinite);
 
     LIBXL_TAILQ_REMOVE(&CTX->etimes, ev, entry);
-    ev->func(egc, ev, &ev->abs);
+    ev->func(egc, ev, &ev->abs, ERROR_TIMEDOUT);
 
  out:
     CTX_UNLOCK;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index af41200..b2109ea 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -172,7 +172,8 @@ struct libxl__ev_fd {
 
 typedef struct libxl__ev_time libxl__ev_time;
 typedef void libxl__ev_time_callback(libxl__egc *egc, libxl__ev_time *ev,
-                                     const struct timeval *requested_abs);
+                                     const struct timeval *requested_abs,
+                                     int rc); /* TIMEDOUT or CANCELLED */
 struct libxl__ev_time {
     /* caller should include this in their own struct */
     /* read-only for caller, who may read only when registered: */
@@ -734,10 +735,10 @@ static inline void libxl__ev_fd_init(libxl__ev_fd *efd)
 static inline int libxl__ev_fd_isregistered(const libxl__ev_fd *efd)
                     { return efd->fd >= 0; }
 
-_hidden int libxl__ev_time_register_rel(libxl__gc*, libxl__ev_time *ev_out,
+_hidden int libxl__ev_time_register_rel(libxl__ao*, libxl__ev_time *ev_out,
                                         libxl__ev_time_callback*,
                                         int milliseconds /* as for poll(2) */);
-_hidden int libxl__ev_time_register_abs(libxl__gc*, libxl__ev_time *ev_out,
+_hidden int libxl__ev_time_register_abs(libxl__ao*, libxl__ev_time *ev_out,
                                         libxl__ev_time_callback*,
                                         struct timeval);
 _hidden int libxl__ev_time_modify_rel(libxl__gc*, libxl__ev_time *ev,
@@ -1048,13 +1049,13 @@ typedef struct libxl__xswait_state libxl__xswait_state;
  *     Otherwise, xswait will continue waiting and watching and
  *     will call you back later.
  *
- * rc==ERROR_TIMEDOUT
+ * rc==ERROR_TIMEDOUT, rc==ERROR_CANCELLED
  *
  *     The specified timeout was reached.
  *     This has NOT been logged (except to the debug log).
  *     xswait will not continue (but calling libxl__xswait_stop is OK).
  *
- * rc!=0, !=ERROR_TIMEDOUT
+ * rc!=0, !=ERROR_TIMEDOUT, !=ERROR_CANCELLED
  *
  *     Some other error occurred.
  *     This HAS been logged.
@@ -1092,8 +1093,9 @@ int libxl__xswait_start(libxl__gc*, libxl__xswait_state*);
 typedef struct libxl__ev_devstate libxl__ev_devstate;
 typedef void libxl__ev_devstate_callback(libxl__egc *egc, libxl__ev_devstate*,
                                          int rc);
-  /* rc will be 0, ERROR_TIMEDOUT, ERROR_INVAL (meaning path was removed),
-   * or ERROR_FAIL if other stuff went wrong (in which latter case, logged) */
+  /* rc will be 0, ERROR_TIMEDOUT, ERROR_CANCELLED, ERROR_INVAL
+   * (meaning path was removed), or ERROR_FAIL if other stuff went
+   * wrong (in which latter case, logged) */
 
 struct libxl__ev_devstate {
     /* read-only for caller, who may read only when waiting: */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/14] libxl: domain create: Do not destroy on cancellation
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (7 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 08/14] libxl: events: Permit timeouts to signal cancellation Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 10/14] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
                                       ` (4 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

If we cancelled the domain creation, do not try to tear it down again
Document this.

This is a backwards-compatible API change since old libxl users will
never cancel any operations.

In the current code, there is no functional change, because
ERROR_CANCELLED is never generated anywhere yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl.h        |    4 ++++
 tools/libxl/libxl_create.c |    2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 12d6c31..784226b 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -607,6 +607,10 @@ int libxl_ctx_free(libxl_ctx *ctx /* 0 is OK */);
 
 /* domain related functions */
 
+/* If the result is ERROR_CANCELLED, the domain may or may not exist
+ * (in a half-created state).  *domid will be valid and will be the
+ * domain id, or -1, as appropriate */
+
 int libxl_domain_create_new(libxl_ctx *ctx, libxl_domain_config *d_config,
                             uint32_t *domid,
                             const libxl_asyncop_how *ao_how,
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e03bb55..c223771 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1262,7 +1262,7 @@ static void domcreate_complete(libxl__egc *egc,
     if (!rc && d_config->b_info.exec_ssidref)
         rc = xc_flask_relabel_domain(CTX->xch, dcs->guest_domid, d_config->b_info.exec_ssidref);
 
-    if (rc) {
+    if (rc && rc != ERROR_CANCELLED) {
         if (dcs->guest_domid) {
             dcs->dds.ao = ao;
             dcs->dds.domid = dcs->guest_domid;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/14] libxl: ao: Record ultimate parent of a nested ao
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (8 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 09/14] libxl: domain create: Do not destroy on cancellation Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 11/14] libxl: ao: Count the nested progeny of an ao Ian Jackson
                                       ` (3 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This will be used by the cancellation machinery.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    |   25 +++++++++++++++----------
 tools/libxl/libxl_internal.h |    3 ++-
 2 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 28b134a..8eb3483 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -31,6 +31,9 @@
 #define DBG(args, ...) LIBXL__DBG_LOG(CTX, args, __VA_ARGS__)
 
 
+static libxl__ao *ao_nested_root(libxl__ao *ao);
+
+
 /*
  * The counter osevent_in_hook is used to ensure that the application
  * honours the reentrancy restriction documented in libxl_event.h.
@@ -1700,7 +1703,7 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     LOG(DEBUG,"ao %p: complete, rc=%d",ao,rc);
     assert(ao->magic == LIBXL__AO_MAGIC);
     assert(!ao->complete);
-    assert(!ao->nested);
+    assert(!ao->nested_root);
     ao->complete = 1;
     ao->rc = rc;
 
@@ -1865,7 +1868,7 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao,
         const libxl_asyncprogress_how *how, libxl_event *ev)
 {
     AO_GC;
-    assert(!ao->nested);
+    assert(!ao->nested_root);
     if (how->callback == dummy_asyncprogress_callback_ignore) {
         LOG(DEBUG,"ao %p: progress report: ignored",ao);
         libxl_event_free(CTX,ev);
@@ -1888,21 +1891,23 @@ void libxl__ao_progress_report(libxl__egc *egc, libxl__ao *ao,
 
 /* nested ao */
 
+static libxl__ao *ao_nested_root(libxl__ao *ao) {
+    libxl__ao *root = ao->nested_root ? : ao;
+    assert(!root->nested_root);
+    return root;
+}
+
 _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 {
-    /* We only use the parent to get the ctx.  However, we require the
-     * caller to provide us with an ao, not just a ctx, to prove that
-     * they are already in an asynchronous operation.  That will avoid
-     * people using this to (for example) make an ao in a non-ao_how
-     * function somewhere in the middle of libxl. */
-    libxl__ao *child = NULL;
+    libxl__ao *child = NULL, *root;
     libxl_ctx *ctx = libxl__gc_owner(&parent->gc);
 
     assert(parent->magic == LIBXL__AO_MAGIC);
+    root = ao_nested_root(parent);
 
     child = libxl__zalloc(&ctx->nogc_gc, sizeof(*child));
     child->magic = LIBXL__AO_MAGIC;
-    child->nested = 1;
+    child->nested_root = root;
     LIBXL_INIT_GC(child->gc, ctx);
     libxl__gc *gc = &child->gc;
 
@@ -1913,7 +1918,7 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 _hidden void libxl__nested_ao_free(libxl__ao *child)
 {
     assert(child->magic == LIBXL__AO_MAGIC);
-    assert(child->nested);
+    assert(child->nested_root);
     libxl_ctx *ctx = libxl__gc_owner(&child->gc);
     libxl__ao__destroy(ctx, child);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b2109ea..3213d73 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -443,7 +443,8 @@ struct libxl__ao {
      * only in libxl__ao_complete.)
      */
     uint32_t magic;
-    unsigned constructing:1, in_initiator:1, complete:1, notified:1, nested:1;
+    unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    libxl__ao *nested_root;
     int progress_reports_outstanding;
     int rc;
     libxl__gc gc;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/14] libxl: ao: Count the nested progeny of an ao
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (9 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 10/14] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 12/14] libxl: ao: Provide manip_refcnt Ian Jackson
                                       ` (2 subsequent siblings)
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

This will detect any "escaped" nested aos.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    |    8 +++++++-
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 8eb3483..04964c8 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1704,6 +1704,7 @@ void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc)
     assert(ao->magic == LIBXL__AO_MAGIC);
     assert(!ao->complete);
     assert(!ao->nested_root);
+    assert(!ao->nested_progeny);
     ao->complete = 1;
     ao->rc = rc;
 
@@ -1908,6 +1909,8 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
     child = libxl__zalloc(&ctx->nogc_gc, sizeof(*child));
     child->magic = LIBXL__AO_MAGIC;
     child->nested_root = root;
+    assert(root->nested_progeny < INT_MAX);
+    root->nested_progeny++;
     LIBXL_INIT_GC(child->gc, ctx);
     libxl__gc *gc = &child->gc;
 
@@ -1918,7 +1921,10 @@ _hidden libxl__ao *libxl__nested_ao_create(libxl__ao *parent)
 _hidden void libxl__nested_ao_free(libxl__ao *child)
 {
     assert(child->magic == LIBXL__AO_MAGIC);
-    assert(child->nested_root);
+    libxl__ao *root = child->nested_root;
+    assert(root);
+    assert(root->nested_progeny > 0);
+    root->nested_progeny--;
     libxl_ctx *ctx = libxl__gc_owner(&child->gc);
     libxl__ao__destroy(ctx, child);
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 3213d73..d2f0372 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -445,6 +445,7 @@ struct libxl__ao {
     uint32_t magic;
     unsigned constructing:1, in_initiator:1, complete:1, notified:1;
     libxl__ao *nested_root;
+    int nested_progeny;
     int progress_reports_outstanding;
     int rc;
     libxl__gc gc;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 12/14] libxl: ao: Provide manip_refcnt
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (10 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 11/14] libxl: ao: Count the nested progeny of an ao Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 13/14] libxl: ao: Cancellation API Ian Jackson
  2013-12-20 18:45                     ` [PATCH 14/14] libxl: ao: Timeouts are cancellable Ian Jackson
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Previously we used in_initiator to stop the ao being freed while we
were still in the initiator function (which would result in the
initiator's call to lixl__ao_inprogress accessing the ao after it had
been freed).

We are going to introduce a new libxl entrypoint which finds, and
operates on, ongoing aos.  This function needs the same protection,
and might even end up running on the same ao multiple times
concurrently.

So do this with reference counting instead, with a new variable
ao->manip_refcnt.

We keep ao->in_initiator because that allows us to keep some useful
asserts about the sequencing of libxl__ao_inprogress, etc.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    |   43 +++++++++++++++++++++++++++++++++---------
 tools/libxl/libxl_internal.h |    1 +
 2 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 04964c8..d4e5697 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -33,6 +33,8 @@
 
 static libxl__ao *ao_nested_root(libxl__ao *ao);
 
+static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
+
 
 /*
  * The counter osevent_in_hook is used to ensure that the application
@@ -1309,8 +1311,7 @@ static void egc_run_callbacks(libxl__egc *egc)
         ao->how.callback(CTX, ao->rc, ao->how.u.for_callback);
         CTX_LOCK;
         ao->notified = 1;
-        if (!ao->in_initiator)
-            libxl__ao__destroy(CTX, ao);
+        ao__check_destroy(CTX, ao);
         CTX_UNLOCK;
     }
 }
@@ -1668,6 +1669,33 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
  *                              - destroy the ao
  */
 
+
+/*
+ * A "manip" is a libxl public function manipulating this ao, which
+ * has a pointer to it.  We have to not destroy it while that's the
+ * case, obviously.
+ */
+static void ao__manip_enter(libxl__ao *ao)
+{
+    assert(ao->manip_refcnt < INT_MAX);
+    ao->manip_refcnt++;
+}
+
+static void ao__manip_leave(libxl_ctx *ctx, libxl__ao *ao)
+{
+    assert(ao->manip_refcnt > 0);
+    ao->manip_refcnt--;
+    ao__check_destroy(ctx, ao);
+}
+
+static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao)
+{
+    if (!ao->manip_refcnt && ao->notified) {
+        assert(ao->complete);
+        libxl__ao__destroy(ctx,ao);
+    }
+}
+
 void libxl__ao__destroy(libxl_ctx *ctx, libxl__ao *ao)
 {
     AO_GC;
@@ -1743,8 +1771,8 @@ void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao)
         }
         ao->notified = 1;
     }
-    if (!ao->in_initiator && ao->notified)
-        libxl__ao__destroy(ctx, ao);
+    
+    ao__check_destroy(ctx, ao);
 }
 
 libxl__ao *libxl__ao_create(libxl_ctx *ctx, uint32_t domid,
@@ -1759,6 +1787,7 @@ libxl__ao *libxl__ao_create(libxl_ctx *ctx, uint32_t domid,
     ao->magic = LIBXL__AO_MAGIC;
     ao->constructing = 1;
     ao->in_initiator = 1;
+    ao__manip_enter(ao);
     ao->poller = 0;
     ao->domid = domid;
     LIBXL_INIT_GC(ao->gc, ctx);
@@ -1839,11 +1868,7 @@ int libxl__ao_inprogress(libxl__ao *ao,
     }
 
     ao->in_initiator = 0;
-
-    if (ao->notified) {
-        assert(ao->complete);
-        libxl__ao__destroy(CTX,ao);
-    }
+    ao__manip_leave(CTX, ao);
 
     return rc;
 }
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index d2f0372..bb239ac 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -444,6 +444,7 @@ struct libxl__ao {
      */
     uint32_t magic;
     unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    int manip_refcnt;
     libxl__ao *nested_root;
     int nested_progeny;
     int progress_reports_outstanding;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 13/14] libxl: ao: Cancellation API
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (11 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 12/14] libxl: ao: Provide manip_refcnt Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  2013-12-20 18:45                     ` [PATCH 14/14] libxl: ao: Timeouts are cancellable Ian Jackson
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Provide libxl_ao_cancel.

There is machinery to allow an ao to register an interest in its
cancellation, using a libxl__ao_cancellable.

This API is not currently very functional: attempting cancellation it
will always return NOTIMPLEMENTED and have no effect.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl.c          |    3 ++
 tools/libxl/libxl.h          |   64 ++++++++++++++++++++++
 tools/libxl/libxl_event.c    |  122 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |   42 ++++++++++++++-
 4 files changed, 230 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 0efef98..903dd6e 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -64,6 +64,8 @@ int libxl_ctx_alloc(libxl_ctx **pctx, int version,
     LIBXL_LIST_INIT(&ctx->evtchns_waiting);
     libxl__ev_fd_init(&ctx->evtchn_efd);
 
+    LIBXL_LIST_INIT(&ctx->aos_inprogress);
+
     LIBXL_TAILQ_INIT(&ctx->death_list);
     libxl__ev_xswatch_init(&ctx->death_watch);
 
@@ -161,6 +163,7 @@ int libxl_ctx_free(libxl_ctx *ctx)
     assert(LIBXL_LIST_EMPTY(&ctx->efds));
     assert(LIBXL_TAILQ_EMPTY(&ctx->etimes));
     assert(LIBXL_LIST_EMPTY(&ctx->evtchns_waiting));
+    assert(LIBXL_LIST_EMPTY(&ctx->aos_inprogress));
 
     if (ctx->xch) xc_interface_close(ctx->xch);
     libxl_version_info_dispose(&ctx->version_info);
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index 784226b..c533ee3 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -409,6 +409,11 @@
  */
 #define LIBXL_HAVE_DRIVER_DOMAIN_CREATION 1
 
+/*
+ * LIBXL_HAVE_AO_CANCEL indicates the availability of libxl_ao_cancel
+ */
+#define LIBXL_HAVE_AO_CANCEL 1
+
 /* Functions annotated with LIBXL_EXTERNAL_CALLERS_ONLY may not be
  * called from within libxl itself. Callers outside libxl, who
  * do not #include libxl_internal.h, are fine. */
@@ -597,6 +602,65 @@ typedef struct {
     void *for_callback; /* passed to callback */
 } libxl_asyncprogress_how;
 
+/*
+ * It is sometimes possible to cancel an asynchronous operation.
+ *
+ * libxl_ao_cancel searches for an ongoing asynchronous operation whose
+ * ao_how is identical to *how, and tries to cancel it.  The return
+ * values from libxl_ao_cancel are as follows:
+ *
+ *  0
+ *
+ *     The operation in question has (at least some) support for
+ *     cancellation.  It will be cut short.  However, it may still
+ *     take some time to cancel.
+ *
+ *  ERROR_NOTFOUND
+ *
+ *      No matching ongoing operation was found.  This might happen
+ *      for an actual operation if the operation has already completed
+ *      (perhaps on another thread).  The call to libxl_ao_cancel has
+ *      had no effect.
+ *
+ *  ERROR_NOTIMPLEMENTED
+ *
+ *     As far as could be determined, the operation in question does
+ *     not support cancellation.  The operation may subsequently
+ *     complete normally, as if it had never been cancelled; however,
+ *     the cancellation attempt will still have been noted and it is
+ *     possible that the operation will be successfully cancelled.
+ *
+ *  ERROR_CANCELLED
+ *
+ *     The operation has already been the subject of a at least one
+ *     call to libxl_ao_cancel.
+ *
+ * If the operation was indeed cut short due to the cancellation, it
+ * will complete, at some point in the future, with ERROR_CANCELLED.
+ * In that case, depending on the operation it have performed some of
+ * the work in question and left the operation half-done.  Consult the
+ * documentation for individual operations.
+ *
+ * Note that a cancelled operation might still fail for other reasons
+ * even after it has been cancelled.
+ *
+ * If your application is multithreaded you must not reuse an
+ * ao_how->for_event or ao_how->for_callback value (with a particular
+ * ao_how->callback) unless you are sure that none of your other
+ * threads are going to cancel the previous operation using that
+ * value; otherwise you risk cancelling the wrong operation if the
+ * intended target of the cancellation completes in the meantime.
+ *
+ * It is possible to cancel even an operation which is being performed
+ * synchronously, but since in that case how==NULL you had better only
+ * have one such operation, because it is not possible to tell them
+ * apart.  (And, if you want to do this, obviously the cancellation
+ * would have to be requested on a different thread.)
+ */
+int libxl_ao_cancel(libxl_ctx *ctx, const libxl_asyncop_how *how)
+                    LIBXL_EXTERNAL_CALLERS_ONLY;
+
+
 #define LIBXL_VERSION 0
 
 /* context functions */
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index d4e5697..e982858 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1715,6 +1715,7 @@ void libxl__ao_abort(libxl__ao *ao)
     assert(ao->in_initiator);
     assert(!ao->complete);
     assert(!ao->progress_reports_outstanding);
+    assert(!ao->cancelling);
     libxl__ao__destroy(CTX, ao);
 }
 
@@ -1874,6 +1875,127 @@ int libxl__ao_inprogress(libxl__ao *ao,
 }
 
 
+/* cancellation */
+
+static int ao__cancel(libxl_ctx *ctx, libxl__ao *parent)
+{
+    int rc;
+    ao__manip_enter(parent);
+
+    if (parent->cancelling) {
+        rc = ERROR_CANCELLED;
+        goto out;
+    }
+
+    parent->cancelling = 1;
+
+    if (LIBXL_LIST_EMPTY(&parent->cancellables)) {
+        LIBXL__LOG(ctx, XTL_DEBUG,
+                   "ao %p: cancellation requested, but not not implemented",
+                   parent);
+        rc = ERROR_NOTIMPLEMENTED;
+        goto out;
+    }
+
+    /* We keep calling cancellation hooks until there are none left */
+    while (!LIBXL_LIST_EMPTY(&parent->cancellables)) {
+        libxl__egc egc;
+        LIBXL_INIT_EGC(egc,ctx);
+
+        assert(!parent->complete);
+
+        libxl__ao_cancellable *canc = LIBXL_LIST_FIRST(&parent->cancellables);
+        assert(parent == ao_nested_root(canc->ao));
+
+        LIBXL_LIST_REMOVE(canc, entry);
+        canc->registered = 0;
+
+        LIBXL__LOG(ctx, XTL_DEBUG, "ao %p: canc=%p: cancelling",
+                   parent, canc->ao);
+        canc->callback(&egc, canc, ERROR_CANCELLED);
+
+        libxl__ctx_unlock(ctx);
+        libxl__egc_cleanup(&egc);
+        libxl__ctx_lock(ctx);
+    }
+
+    rc = 0;
+
+ out:
+    ao__manip_leave(ctx, parent);
+    return rc;
+}
+
+_hidden int libxl_ao_cancel(libxl_ctx *ctx, const libxl_asyncop_how *how)
+{
+    libxl__ao *search;
+    libxl__ctx_lock(ctx);
+    int rc;
+
+    LIBXL_LIST_FOREACH(search, &ctx->aos_inprogress, inprogress_entry) {
+        if (how) {
+            /* looking for ao to be reported by callback or event */
+            if (search->poller)
+                /* sync */
+                continue;
+            if (how->callback != search->how.callback)
+                continue;
+            if (how->callback
+                ? (how->u.for_callback != search->how.u.for_callback)
+                : (how->u.for_event != search->how.u.for_event))
+                continue;
+        } else {
+            /* looking for synchronous call */
+            if (!search->poller)
+                /* async */
+                continue;
+        }
+        goto found;
+    }
+    rc = ERROR_NOTFOUND;
+    goto out;
+
+ found:
+    rc = ao__cancel(ctx, search);
+ out:
+    libxl__ctx_unlock(ctx);
+    return rc;
+}
+
+int libxl__ao_cancellable_register(libxl__ao_cancellable *canc)
+{
+    libxl__ao *ao = canc->ao;
+    libxl__ao *root = ao_nested_root(ao);
+    AO_GC;
+
+    if (root->cancelling) {
+ DBG("ao=%p: preemptively cancelling cancellable registration %p (root=%p)",
+            ao, canc, root);
+        return ERROR_CANCELLED;
+    }
+
+    DBG("ao=%p, canc=%p: registering (root=%p)", ao, canc, root);
+    LIBXL_LIST_INSERT_HEAD(&root->cancellables, canc, entry);
+    canc->registered = 1;
+
+    return 0;
+}
+
+_hidden void libxl__ao_cancellable_deregister(libxl__ao_cancellable *canc)
+{
+    if (!canc->registered)
+        return;
+
+    libxl__ao *ao = canc->ao;
+    libxl__ao *root __attribute__((unused)) = ao_nested_root(ao);
+    AO_GC;
+
+    DBG("ao=%p, canc=%p: deregistering (root=%p)", ao, canc, root);
+    LIBXL_LIST_REMOVE(canc, entry);
+    canc->registered = 0;
+}
+
+
 /* progress reporting */
 
 /* The application indicates a desire to ignore events by passing NULL
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index bb239ac..951a77d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -170,6 +170,41 @@ struct libxl__ev_fd {
 };
 
 
+typedef struct libxl__ao_cancellable libxl__ao_cancellable;
+typedef void libxl__ao_cancellable_callback(libxl__egc *egc,
+                  libxl__ao_cancellable *cancellable, int rc /* CANCELLED */);
+
+struct libxl__ao_cancellable {
+    /* caller must fill this in and it must remain valid */
+    libxl__ao *ao;
+    libxl__ao_cancellable_callback *callback;
+    /* remainder is private for cancellation machinery */
+    bool registered;
+    LIBXL_LIST_ENTRY(libxl__ao_cancellable) entry;
+    /*
+     * For nested aos:
+     *  Semantically, cancellation affects the whole tree of aos,
+     *    not just the parent.
+     *  libxl__ao_cancellable.ao refers to the child, so
+     *    that the child callback sees the right ao.  (After all,
+     *    it was code dealing with the child that set .ao.)
+     *  But, the cancellable is recorded on the "cancellables" list
+     *    for the ultimate root ao, so that every possible child
+     *    cancellation occurs as a result of the cancellation of the
+     *    parent.
+     *  We set ao->cancelling only in the root.
+     */
+};
+
+_hidden int libxl__ao_cancellable_register(libxl__ao_cancellable*);
+_hidden void libxl__ao_cancellable_deregister(libxl__ao_cancellable*);
+
+static inline void libxl__ao_cancellable_init
+  (libxl__ao_cancellable *c) { c->registered = 0; }
+static inline bool libxl__ao_cancellable_isregistered
+  (const libxl__ao_cancellable *c) { return c->registered; }
+
+
 typedef struct libxl__ev_time libxl__ev_time;
 typedef void libxl__ev_time_callback(libxl__egc *egc, libxl__ev_time *ev,
                                      const struct timeval *requested_abs,
@@ -359,6 +394,8 @@ struct libxl__ctx {
     LIBXL_LIST_HEAD(, libxl__ev_evtchn) evtchns_waiting;
     libxl__ev_fd evtchn_efd;
 
+    LIBXL_LIST_HEAD(, libxl__ao) aos_inprogress;
+
     LIBXL_TAILQ_HEAD(libxl__evgen_domain_death_list, libxl_evgen_domain_death)
         death_list /* sorted by domid */,
         death_reported;
@@ -443,12 +480,15 @@ struct libxl__ao {
      * only in libxl__ao_complete.)
      */
     uint32_t magic;
-    unsigned constructing:1, in_initiator:1, complete:1, notified:1;
+    unsigned constructing:1, in_initiator:1, complete:1, notified:1,
+        cancelling:1;
     int manip_refcnt;
     libxl__ao *nested_root;
     int nested_progeny;
     int progress_reports_outstanding;
     int rc;
+    LIBXL_LIST_HEAD(, libxl__ao_cancellable) cancellables;
+    LIBXL_LIST_ENTRY(libxl__ao) inprogress_entry;
     libxl__gc gc;
     libxl_asyncop_how how;
     libxl__poller *poller;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 14/14] libxl: ao: Timeouts are cancellable
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
                                       ` (12 preceding siblings ...)
  2013-12-20 18:45                     ` [PATCH 13/14] libxl: ao: Cancellation API Ian Jackson
@ 2013-12-20 18:45                     ` Ian Jackson
  13 siblings, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2013-12-20 18:45 UTC (permalink / raw)
  To: xen-devel; +Cc: Ian Jackson, Ian Campbell

Make libxl__ev_time* register with the cancellation machinery, so that
libxl_ao_cancel can cancel any operation which has a timeout.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
 tools/libxl/libxl_event.c    |   27 +++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |    3 ++-
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index e982858..8f85162 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -292,6 +292,8 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev,
 
 static void time_deregister(libxl__gc *gc, libxl__ev_time *ev)
 {
+    libxl__ao_cancellable_deregister(&ev->cancel);
+
     if (!ev->infinite) {
         struct timeval right_away = { 0, 0 };
         if (ev->nexus) /* only set if app provided hooks */
@@ -314,6 +316,23 @@ static void time_done_debug(libxl__gc *gc, const char *func,
 #endif
 }
 
+static void time_cancelled(libxl__egc *egc, libxl__ao_cancellable *canc, int rc)
+{
+    libxl__ev_time *ev = CONTAINER_OF(canc, *ev, cancel);
+    EGC_GC;
+
+    time_deregister(gc, ev);
+    DBG("ev_time=%p cancelled", ev);
+    ev->func(egc, ev, &ev->abs, rc);
+}
+
+static int time_register_cancel(libxl__ao *ao, libxl__ev_time *ev)
+{
+    ev->cancel.ao = ao;
+    ev->cancel.callback = time_cancelled;
+    return libxl__ao_cancellable_register(&ev->cancel);
+}
+
 int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
                                 libxl__ev_time_callback *func,
                                 struct timeval absolute)
@@ -326,6 +345,9 @@ int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
     DBG("ev_time=%p register abs=%lu.%06lu",
         ev, (unsigned long)absolute.tv_sec, (unsigned long)absolute.tv_usec);
 
+    rc = time_register_cancel(ao, ev);
+    if (rc) goto out;
+
     rc = time_register_finite(gc, ev, absolute);
     if (rc) goto out;
 
@@ -333,6 +355,7 @@ int libxl__ev_time_register_abs(libxl__ao *ao, libxl__ev_time *ev,
 
     rc = 0;
  out:
+    libxl__ao_cancellable_deregister(&ev->cancel);
     time_done_debug(gc,__func__,ev,rc);
     CTX_UNLOCK;
     return rc;
@@ -351,6 +374,9 @@ int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
 
     DBG("ev_time=%p register ms=%d", ev, milliseconds);
 
+    rc = time_register_cancel(ao, ev);
+    if (rc) goto out;
+
     if (milliseconds < 0) {
         ev->infinite = 1;
     } else {
@@ -365,6 +391,7 @@ int libxl__ev_time_register_rel(libxl__ao *ao, libxl__ev_time *ev,
     rc = 0;
 
  out:
+    libxl__ao_cancellable_deregister(&ev->cancel);
     time_done_debug(gc,__func__,ev,rc);
     CTX_UNLOCK;
     return rc;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 951a77d..927f67e 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -218,6 +218,7 @@ struct libxl__ev_time {
     LIBXL_TAILQ_ENTRY(libxl__ev_time) entry;
     struct timeval abs;
     libxl__osevent_hook_nexus *nexus;
+    libxl__ao_cancellable cancel;
 };
 
 typedef struct libxl__ev_xswatch libxl__ev_xswatch;
@@ -790,7 +791,7 @@ _hidden int libxl__ev_time_modify_abs(libxl__gc*, libxl__ev_time *ev,
                                       struct timeval);
 _hidden void libxl__ev_time_deregister(libxl__gc*, libxl__ev_time *ev);
 static inline void libxl__ev_time_init(libxl__ev_time *ev)
-                { ev->func = 0; }
+                { ev->func = 0; libxl__ao_cancellable_init(&ev->cancel); }
 static inline int libxl__ev_time_isregistered(const libxl__ev_time *ev)
                 { return !!ev->func; }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2013-12-20 18:24                 ` Ian Jackson
  2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
@ 2014-03-14 10:42                   ` Ian Campbell
  2014-03-14 12:32                     ` Simon Beaumont
  2014-03-14 17:09                     ` Ian Jackson
  1 sibling, 2 replies; 29+ messages in thread
From: Ian Campbell @ 2014-03-14 10:42 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Simon Beaumont, Jim Fehlig, xs-devel, xen-devel

On Fri, 2013-12-20 at 18:24 +0000, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> > On Fri, 2013-11-08 at 18:38 +0000, Ian Jackson wrote:
> > > I've been thinking about this some more and looking at the code.
> > > 
> > > I have the following sketch of an approach:
> 
> I have now an RFC patch series, which I'm going to send as a reply to
> this email.

Si, do you have any feedback on the shape of/usefulness of the
interfaces added in this series before I start digging in and reviewing
the implementation?

Jim, does libvirt have any potential uses for cancelling such
operations? If so then perhaps you have an opinion on the interfaces
too?

The series is at:
http://thread.gmane.org/gmane.comp.emulators.xen.devel/175852/focus=183992

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2014-03-14 10:42                   ` FW: Cancelling asynchronous operations in libxl Ian Campbell
@ 2014-03-14 12:32                     ` Simon Beaumont
  2014-03-14 17:09                     ` Ian Jackson
  1 sibling, 0 replies; 29+ messages in thread
From: Simon Beaumont @ 2014-03-14 12:32 UTC (permalink / raw)
  To: Ian Campbell, Dave Scott, Rob Hoes
  Cc: Ian Jackson, Jim Fehlig, xs-devel, xen-devel

(+Dave, Rob)

Thanks Ian,

I’m adding Rob and Dave who will be able to better comment.

Si
On 14 Mar 2014, at 10:42, Ian Campbell <Ian.Campbell@citrix.com> wrote:

> On Fri, 2013-12-20 at 18:24 +0000, Ian Jackson wrote:
>> Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
>>> On Fri, 2013-11-08 at 18:38 +0000, Ian Jackson wrote:
>>>> I've been thinking about this some more and looking at the code.
>>>> 
>>>> I have the following sketch of an approach:
>> 
>> I have now an RFC patch series, which I'm going to send as a reply to
>> this email.
> 
> Si, do you have any feedback on the shape of/usefulness of the
> interfaces added in this series before I start digging in and reviewing
> the implementation?
> 
> Jim, does libvirt have any potential uses for cancelling such
> operations? If so then perhaps you have an opinion on the interfaces
> too?
> 
> The series is at:
> http://thread.gmane.org/gmane.comp.emulators.xen.devel/175852/focus=183992
> 
> Ian.
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: FW: Cancelling asynchronous operations in libxl
  2014-03-14 10:42                   ` FW: Cancelling asynchronous operations in libxl Ian Campbell
  2014-03-14 12:32                     ` Simon Beaumont
@ 2014-03-14 17:09                     ` Ian Jackson
  1 sibling, 0 replies; 29+ messages in thread
From: Ian Jackson @ 2014-03-14 17:09 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Simon Beaumont, Jim Fehlig, xs-devel, xen-devel

Ian Campbell writes ("Re: [Xen-devel] FW: Cancelling asynchronous operations in libxl"):
> On Fri, 2013-12-20 at 18:24 +0000, Ian Jackson wrote:
> > I have now an RFC patch series, which I'm going to send as a reply to
> > this email.
> 
> Si, do you have any feedback on the shape of/usefulness of the
> interfaces added in this series before I start digging in and reviewing
> the implementation?
> 
> Jim, does libvirt have any potential uses for cancelling such
> operations? If so then perhaps you have an opinion on the interfaces
> too?
> 
> The series is at:
> http://thread.gmane.org/gmane.comp.emulators.xen.devel/175852/focus=183992

I've pushed this here:

  http://xenbits.xen.org/gitweb/?p=people/iwj/xen.git;a=shortlog;h=refs/heads/wip.ao-cancel

NB that this is a rebasing branch.

(I haven't done anything to it since I posted it.  In particular, it's
not been rebased to post-4.4 staging.)

Ian.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* FW: Cancelling asynchronous operations in libxl
@ 2013-10-15 16:08 Simon Beaumont
  0 siblings, 0 replies; 29+ messages in thread
From: Simon Beaumont @ 2013-10-15 16:08 UTC (permalink / raw)
  To: xen-devel; +Cc: xs-devel

We're in the process of porting xenopsd[1] to libxl, rather than driving libxc manually.

Using our libxc-based backend, we are able to cancel operations. For operations that are using XenStore watches we wrap these in a cancellable_watch[2], and for operations that make use of a subprocess we send SIGKILL when we wish to cancel the associated task. We would then instrument any necessary cleanup by hand on a best-effort basis.

With the move to libxl, a lot of this control will be abstracted away and it is unclear how best to allow long-running tasks to be cancelled. It seems most of these operations could be executed asynchronously but we wonder if it is possible to cancel them, or how we could add cancellation functionality to these operations?

Cheers,

Si

[1]: https://github.com/xapi-project/xenopsd
[2]: https://github.com/xapi-project/xenopsd/blob/master/xc/cancel_utils.ml#L101

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2014-03-14 17:09 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <4B8F5D33B081C044AA43634E84ED7F9616A83D@AMSPEX01CL03.citrite.net>
2013-10-23 17:23 ` FW: Cancelling asynchronous operations in libxl Konrad Rzeszutek Wilk
2013-10-26  8:33   ` Ian Campbell
     [not found]   ` <1382776392.22417.179.camel@hastur.hellion.org.uk>
2013-10-28  9:38     ` Simon Beaumont
2013-10-28 15:52     ` Ian Jackson
2013-10-31 13:52       ` Ian Campbell
2013-10-31 14:32         ` Ian Jackson
2013-10-31 17:09           ` Ian Campbell
2013-11-08 18:38             ` Ian Jackson
2013-11-20 11:01               ` Ian Campbell
2013-12-20 18:24                 ` Ian Jackson
2013-12-20 18:45                   ` [RFC PATCH 00/14] libxl: Asynchronous event cancellation Ian Jackson
2013-12-20 18:45                     ` [PATCH 01/14] libxl: suspend: switch_logdirty_done takes rc Ian Jackson
2013-12-20 18:45                     ` [PATCH 02/14] libxl: suspend: common suspend callbacks take rc Ian Jackson
2013-12-20 18:45                     ` [PATCH 03/14] libxl: suspend: Return correct error from callbacks Ian Jackson
2013-12-20 18:45                     ` [PATCH 04/14] libxl: Use libxl__xswait* in libxl__ao_device Ian Jackson
2013-12-20 18:45                     ` [PATCH 05/14] libxl: xswait/devstate: Move xswait to before devstate Ian Jackson
2013-12-20 18:45                     ` [PATCH 06/14] libxl: devstate: Use libxl__xswait* Ian Jackson
2013-12-20 18:45                     ` [PATCH 07/14] libxl: New error codes CANCELLED etc Ian Jackson
2013-12-20 18:45                     ` [PATCH 08/14] libxl: events: Permit timeouts to signal cancellation Ian Jackson
2013-12-20 18:45                     ` [PATCH 09/14] libxl: domain create: Do not destroy on cancellation Ian Jackson
2013-12-20 18:45                     ` [PATCH 10/14] libxl: ao: Record ultimate parent of a nested ao Ian Jackson
2013-12-20 18:45                     ` [PATCH 11/14] libxl: ao: Count the nested progeny of an ao Ian Jackson
2013-12-20 18:45                     ` [PATCH 12/14] libxl: ao: Provide manip_refcnt Ian Jackson
2013-12-20 18:45                     ` [PATCH 13/14] libxl: ao: Cancellation API Ian Jackson
2013-12-20 18:45                     ` [PATCH 14/14] libxl: ao: Timeouts are cancellable Ian Jackson
2014-03-14 10:42                   ` FW: Cancelling asynchronous operations in libxl Ian Campbell
2014-03-14 12:32                     ` Simon Beaumont
2014-03-14 17:09                     ` Ian Jackson
2013-10-15 16:08 Simon Beaumont

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.