From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:41469)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Wu0IX-0003Gq-8e
    for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:11:48 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Wu0IP-0005I1-N6
    for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:11:41 -0400
Received: from mx1.redhat.com ([209.132.183.28]:2094)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Wu0IP-0005Hv-A2
    for qemu-devel@nongnu.org; Mon, 09 Jun 2014 10:11:33 -0400
Received: from int-mx13.intmail.prod.int.phx2.redhat.com (int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
    by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s59EBWaV031121
    (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK)
    for ; Mon, 9 Jun 2014 10:11:32 -0400
Message-ID: <5395C091.5010701@redhat.com>
Date: Mon, 09 Jun 2014 16:11:29 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <1402322375-18899-1-git-send-email-stefanha@redhat.com>
In-Reply-To: <1402322375-18899-1-git-send-email-stefanha@redhat.com>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] docs/multiple-iothreads.txt: add documentation on IOThread programming
To: Stefan Hajnoczi, qemu-devel@nongnu.org
Cc: Kevin Wolf, Fam Zheng

On 09/06/2014 15:59, Stefan Hajnoczi wrote:
> This document explains how IOThreads and the main loop are related,
> especially how to write code that can run in an IOThread.  Currently only
> virtio-blk-data-plane uses these techniques.  The next obvious target is
> virtio-scsi; there has also been work on virtio-net.
>
> Signed-off-by: Stefan Hajnoczi
> ---
>  docs/multiple-iothreads.txt | 124 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 124 insertions(+)
>  create mode 100644 docs/multiple-iothreads.txt
>
> diff --git a/docs/multiple-iothreads.txt b/docs/multiple-iothreads.txt
> new file mode 100644
> index 0000000..f2b008d
> --- /dev/null
> +++ b/docs/multiple-iothreads.txt
> @@ -0,0 +1,124 @@
> +This document explains the IOThread feature and how to write code that runs
> +outside the QEMU global mutex.
> +
> +The main loop and IOThreads
> +---------------------------
> +QEMU is an event-driven program that can do several things at once using an
> +event loop.  The VNC server and the QMP monitor are both processed from the
> +same event loop which monitors their file descriptors until they become
> +readable and then invokes a callback.
> +
> +The default event loop is called the main loop (see main-loop.c).  It is
> +possible to create additional event loop threads using -object
> +iothread,id=my-iothread.
> +
> +Side note: The main loop and IOThread are both event loops but their code is
> +not shared completely.  Sometimes it is useful to remember that although they
> +are conceptually similar they are currently not interchangeable.

Actually, the main loop does include all the iothread code.  So you
could say that the main loop is a superset of the iothread.

> +How to program for IOThreads
> +----------------------------
> +The main difference between legacy code and new code that can run in an
> +IOThread is dealing explicitly with the event loop object, AioContext
> +(see include/block/aio.h).  Code that only works in the main loop
> +implicitly uses the main loop's AioContext.  Code that supports running
> +in IOThreads must be aware of its AioContext.
> +
> +AioContext supports the following services:
> + * File descriptor monitoring (read/write/error)

POSIX only, at least for now.

> + * Event notifiers (inter-thread signalling)
> + * Timers
> + * Bottom Halves (BH) deferred callbacks
> +
> +There are several old APIs that use the main loop AioContext:
> + * LEGACY qemu_aio_set_fd_handler() - monitor a file descriptor
> + * LEGACY qemu_aio_set_event_notifier() - monitor an event notifier

seems to be unused

> + * LEGACY timer_new_ms() - create a timer
> + * LEGACY qemu_bh_new() - create a BH
> + * LEGACY qemu_aio_wait() - run an event loop iteration

also seems to be unused except for qemu-io-cmds.c (and easily removed
from there).

Perhaps add a note (here or elsewhere) that timer_new_ms/qemu_bh_new
should never be used in the block layer?

> +Since they implicitly work on the main loop they cannot be used in code that
> +runs in an IOThread.  They might cause a crash or deadlock if called from an
> +IOThread since the QEMU global mutex is not held.
> +
> +Instead, use the AioContext functions directly (see include/block/aio.h):
> + * aio_set_fd_handler() - monitor a file descriptor
> + * aio_set_event_notifier() - monitor an event notifier
> + * aio_timer_new() - create a timer
> + * aio_bh_new() - create a BH
> + * aio_poll() - run an event loop iteration
> +
> +The AioContext can be obtained from the IOThread using
> +iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
> +Code that takes an AioContext argument works both in IOThreads or the main
> +loop, depending on which AioContext instance the caller passes in.

Perfect.

> +How to synchronize with an IOThread
> +-----------------------------------
> +AioContext is not thread-safe so some rules must be followed when using file
> +descriptors, event notifiers, timers, or BHs across threads:
> +
> +1. AioContext functions can be called safely from file descriptor, event
> +notifier, timer, or BH callbacks invoked by the AioContext.  No locking is
> +necessary.
> +
> +2.
Other threads wishing to access the AioContext must use
> +aio_context_acquire()/aio_context_release() for mutual exclusion.  Once the
> +context is acquired no other thread can access it or run event loop iterations
> +in this AioContext.
> +
> +aio_context_acquire()/aio_context_release() calls may be nested.  This
> +means you can call them if you're not sure whether #1 applies.
> +
> +There is currently no lock ordering rule if a thread needs to acquire multiple
> +AioContexts simultaneously.  Therefore, it is only safe for code holding the
> +QEMU global mutex to acquire other AioContexts.

Good point (and a nice way out of the lock ordering quagmire...).

Paolo

> +Side note: the best way to schedule a function call across threads is to create
> +a BH in the target AioContext beforehand and then call qemu_bh_schedule().  No
> +acquire/release or locking is needed for the qemu_bh_schedule() call.  But be
> +sure to acquire the AioContext for aio_bh_new() if necessary.
> +
> +The relationship between AioContext and the block layer
> +-------------------------------------------------------
> +The AioContext originates from the QEMU block layer because it provides a
> +scoped way of running event loop iterations until all work is done.  This
> +feature is used to complete all in-flight block I/O requests (see
> +bdrv_drain_all()).  Nowadays AioContext is a generic event loop that can be
> +used by any QEMU subsystem.
> +
> +The block layer has support for AioContext integrated.  Each BlockDriverState
> +is associated with an AioContext using bdrv_set_aio_context() and
> +bdrv_get_aio_context().  This allows block layer code to process I/O inside the
> +right AioContext.  Other subsystems may wish to follow a similar approach.
> +
> +If main loop code such as a QMP function wishes to access a BlockDriverState it
> +must first call aio_context_acquire(bdrv_get_aio_context(bs)) to ensure the
> +IOThread does not run in parallel.
> +
> +Long-running jobs (usually in the form of coroutines) are best scheduled in the
> +BlockDriverState's AioContext to avoid the need to acquire/release around each
> +bdrv_*() call.  Be aware that there is currently no mechanism to get notified
> +when bdrv_set_aio_context() moves this BlockDriverState to a different
> +AioContext (see bdrv_detach_aio_context()/bdrv_attach_aio_context()), so you
> +may need to add this if you want to support long-running jobs.
>
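
For readers who want to see why scheduling a pre-created BH needs no
acquire/release, the side note in the quoted patch can be sketched outside
of QEMU roughly as follows.  This is a toy model and not QEMU code: ToyBH,
ToyContext, toy_bh_init(), toy_bh_schedule(), toy_poll() and toy_count()
are hypothetical names invented for illustration, and the sketch assumes
that scheduling only has to flip one atomic flag.  A real event loop would
also have to wake up the polling thread, which is left out here.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical callback type, standing in for a BH callback. */
typedef void ToyBHFunc(void *opaque);

/* Hypothetical bottom half: a callback plus an atomic "scheduled" flag. */
typedef struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
    atomic_int scheduled;   /* the only field other threads may write */
    struct ToyBH *next;
} ToyBH;

/* Hypothetical event loop context: just a list of bottom halves. */
typedef struct {
    ToyBH *bhs;             /* the list itself is NOT thread-safe */
} ToyContext;

/*
 * Creating a BH modifies the context's list, so in this model it must be
 * done by the owning thread or with the context locked (the analogue of
 * acquiring the AioContext around aio_bh_new()).
 */
static void toy_bh_init(ToyContext *ctx, ToyBH *bh, ToyBHFunc *cb, void *opaque)
{
    bh->cb = cb;
    bh->opaque = opaque;
    atomic_init(&bh->scheduled, 0);
    bh->next = ctx->bhs;
    ctx->bhs = bh;
}

/* Scheduling only sets an atomic flag, so any thread may call it unlocked. */
static void toy_bh_schedule(ToyBH *bh)
{
    atomic_store(&bh->scheduled, 1);
}

/*
 * One event loop iteration, run by the thread that owns the context.
 * Each scheduled BH runs once and is unscheduled again.
 * Returns the number of callbacks that ran.
 */
static int toy_poll(ToyContext *ctx)
{
    int ran = 0;

    for (ToyBH *bh = ctx->bhs; bh != NULL; bh = bh->next) {
        if (atomic_exchange(&bh->scheduled, 0)) {
            bh->cb(bh->opaque);
            ran++;
        }
    }
    return ran;
}

/* Example callback: increments the int that opaque points to. */
static void toy_count(void *opaque)
{
    ++*(int *)opaque;
}
```

In this model a caller would set up the BH once with toy_bh_init(), let
any thread call toy_bh_schedule(), and have the owning thread pick the
work up on its next toy_poll() iteration.  Scheduling twice before a poll
still runs the callback only once, since the flag is merely set again.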