From: Ian Jackson
Subject: Re: [RFC PATCH v2 00/29] libxl: Cancelling asynchronous operations
Date: Tue, 7 Apr 2015 18:19:52 +0100
Message-ID: <21796.4536.722643.245763@mariner.uk.xensource.com>
References: <1423599016-32639-1-git-send-email-ian.jackson@eu.citrix.com> <20150218161035.GA4022@citrix.com> <20150407170842.GD3099@citrix.com>
In-Reply-To: <20150407170842.GD3099@citrix.com>
To: Euan Harris
Cc: xen-devel@lists.xensource.com, james.bulpin@citrix.com
List-Id: xen-devel@lists.xenproject.org

Euan Harris writes ("Re: [RFC PATCH v2 00/29] libxl: Cancelling asynchronous operations"):
> On Wed, Feb 18, 2015 at 04:10:35PM +0000, Euan Harris wrote:
> I think the most straightforward way to test the cancellation mechanism in
> LibXL will be to adapt the way we test similar functionality in xenopsd:
>
>   * define numbered 'cancellation points' at which cancellable operations
>     can be cancelled
>   * before testing a cancellable operation, pre-set the cancellation point
>     at which cancellation should be attempted
>   * when execution reaches the pre-set cancellation point, run the cancellation
>     procedure

This seems likely to work.

> This approach alone will not allow us to test asynchronous cancellation in
> the middle of long-running operations, such as writing a suspend image
> to disk - that will require a way to synchronize the test program with
> the long-running operation.

On the contrary, I think many long-running operations, such as suspend
and migrations, involve multiple iterations of the libxl event loop.
Actual suspend/migrate is done in a helper process; the main process
is responsible for progress report handling, coordination, etc.
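
A minimal sketch of the counter-and-trigger scheme from the quoted list,
written as a standalone toy model rather than libxl code. It assumes one
cancellation point per event loop iteration, which is a simplification.
Every name in it (struct test_ctx, cancellation_point, run_operation,
exercise_all_points) is hypothetical and is not part of the libxl API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical test state; none of these names are libxl API. */
struct test_ctx {
    int point_counter;   /* cancellation points passed so far */
    int trigger_point;   /* cancel at this point; 0 means never */
};

/* Called at each numbered cancellation point.  Returns true if the
 * test wants the operation cancelled here. */
static bool cancellation_point(struct test_ctx *t)
{
    t->point_counter++;
    return t->trigger_point != 0 && t->point_counter == t->trigger_point;
}

enum op_result { OP_DONE, OP_CANCELLED };

/* Stand-in for a long-running operation that takes several event
 * loop iterations, with one cancellation point per iteration. */
static enum op_result run_operation(struct test_ctx *t, int iterations)
{
    for (int i = 0; i < iterations; i++) {
        if (cancellation_point(t))
            return OP_CANCELLED;
        /* ... one iteration's worth of real work would go here ... */
    }
    return OP_DONE;
}

/* Try cancelling at point 1, 2, 3, ... until the operation runs to
 * completion without reaching the trigger.  Returns the number of
 * distinct cancellation points that were exercised. */
static int exercise_all_points(int iterations)
{
    int exercised = 0;
    for (int trigger = 1; ; trigger++) {
        struct test_ctx t = { .point_counter = 0, .trigger_point = trigger };
        if (run_operation(&t, iterations) == OP_DONE)
            break;
        assert(t.point_counter == trigger);
        exercised++;
    }
    return exercised;
}
```

The driver does not need to know the number of cancellation points in
advance: it stops at the first trigger value that the operation never
reaches.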

> My first guess about how this might be done was:
>
>   * add current cancellation point and a trigger point variables to the context
>     struct
>   * increment the counter and fire the cancellation logic in
>     libxl__ao_cancellable_register()
>
> In this way we could write a loop which iterated through all possible
> cancellation points. However you pointed out that we cannot call
> libxl_ao_cancel() while holding the context lock, so this idea needs
> some refinement. One possibility would be to tell another thread to try
> to do the cancellation immediately after we release the lock; another
> option, if we didn't want to write a multi-thread test driver,
> would be to do the cancellation at the top of libxl's event loop.

The relevant function for this latter approach is eventloop_iteration
in libxl_event.c.  This is used by libxl whenever the caller specifies
that a long-running operation is to be done synchronously (ao_how==0),
which is what xl does.

You might also consider whether to add a debug option for
afterpoll_internal to make it return after every callback (ie, after
the call to efd->func() and the call to time_occurs).  That would
allow you to inject cancellation in a slightly more fine-grained
manner.

> I think this captures roughly what we talked about. Please let me know
> if I misunderstood or missed out any details.

I also mentioned that counting invocations of
libxl__ao_cancellable_register is less than ideal because it is very
coarse-grained.

Regards,
Ian.