From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ian Campbell Subject: Re: [PATCH] fix race condition between libvirtd event handling and libxl fd deregister Date: Fri, 23 Nov 2012 09:43:42 +0000 Message-ID: <1353663822.13542.183.camel@zakaz.uk.xensource.com> References: <0451e6041bdd88c90eee.1353395794@linux-bjrd.bjz> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <0451e6041bdd88c90eee.1353395794@linux-bjrd.bjz> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Bamvor Jian Zhang Cc: "jfehlig@suse.com" , Ian Jackson , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org On Tue, 2012-11-20 at 07:16 +0000, Bamvor Jian Zhang wrote: > the race condition may be encounted at the following senaro: > (1), xenlight will remove fd handle just after the transfer is done according to > the buffer pointer. This action will first mark fd handle deleted in libvirtd > then remove fd handler from list in libxl. To mark the fd deleted in libvirtd, > the libvirt event loop mutex must be acquired. > > (2), meanwhile in the libvirt event dispatch process, libvirt will check the fd > deleted flag while holding the event loop mutex. At this time, "(1)" may be > blocked from marking the deleted flag. Then libvirt release its mutex temperary > to run the dispatcher callback. But this callback need xenlight lock(CTX_LOCK) > which is held by xenlight fd deregister function. So, libvirtd will continue to > run this callback after fd deregister exit which means xenlight has been marked > deleted flag, removed this fd handler and set ev->fd as -1. after > libxl__ev_fd_deregister exit, it is time for callback running. but > unfortunately, this callback has been removed as I mentioned above. > > reference the following graph: > libvirt event dispatch xenlight transfer done > | enter libxl__ev_fd_deregister > | CTX_LOCK > | | > | | > | enter osevent_fd_deregister > | | > | enter virEventRemoveHandle > | waiting virMutexLock > check handler delete flag | > virMutexUnlock | > | virMutexLock > enter libxl_osevent_occurred_fd | > waiting CTX_LOCK mark delete flag > | virMutexUnlock > | | > | exit virEventRemoveHandle > | exit osevent_fd_deregister > | | > | remove fd handler from list > | set ev->fd as -1 > | CTX_UNLOCK > CTX_LOCK > assert(fd==ev->fd) //lead to crash > call back in libxl > CTX_UNLOCK > exit libxl_osevent_occurred_fd This all seems pretty plausible to me, but I'd like to have an Ack from Ian J before I commit. > at the same time, i found the times of file handler register is less > than the times of file handler deregister. is that right? seems that > it will be better if the register and deregister is paired. How many extra file handles are we talking about? Presumably some long lived file handles will stay around until you call libxl_ctx_free for the ctx. Or are you doing this and saying you are seeing leaked fd's? > Signed-off-by: Bamvor Jian Zhang > > diff -r 321f8487379b -r 0451e6041bdd tools/libxl/libxl_event.c > --- a/tools/libxl/libxl_event.c Thu Nov 15 10:25:29 2012 +0000 > +++ b/tools/libxl/libxl_event.c Tue Nov 20 14:56:04 2012 +0800 > @@ -1018,11 +1018,15 @@ void libxl_osevent_occurred_fd(libxl_ctx > CTX_LOCK; > assert(!CTX->osevent_in_hook); > > + if (!libxl__ev_fd_isregistered(ev)) { > + DBG("ev_fd=%p deregister unregistered",ev); > + goto out; > + } > assert(fd == ev->fd); > revents &= ev->events; > if (revents) > ev->func(egc, ev, fd, ev->events, revents); > - > +out: > CTX_UNLOCK; > EGC_FREE; > } > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel