Message-ID: <4C8F92D9.2000908@codemonkey.ws>
Date: Tue, 14 Sep 2010 10:20:57 -0500
From: Anthony Liguori
Subject: Re: [Qemu-devel] qcow2 performance plan
References: <4C8F7394.8060802@redhat.com> <4C8F7BE4.5010102@codemonkey.ws> <4C8F9087.2050005@redhat.com>
In-Reply-To: <4C8F9087.2050005@redhat.com>
To: Kevin Wolf
Cc: Avi Kivity, qemu-devel

On 09/14/2010 10:11 AM, Kevin Wolf wrote:
> On 14.09.2010 15:43, Anthony Liguori wrote:
>> Hi Avi,
>>
>> On 09/14/2010 08:07 AM, Avi Kivity wrote:
>>> Here's a draft of a plan that should improve qcow2 performance.  It's
>>> written in wiki syntax for eventual upload to wiki.qemu.org; lines
>>> starting with # are numbered lists, not comments.
>>
>> Thanks for putting this together.  I think it's really useful to think
>> through the problem before anyone jumps in and starts coding.
>>
>>> = Basics =
>>>
>>> At the minimum level, no operation should block the main thread.
>>> This could be done in two ways: extending the state machine so that
>>> each blocking operation can be performed asynchronously (bdrv_aio_*)
>>> or by threading: each new operation is handed off to a worker thread.
>>> Since a full state machine is prohibitively complex, this document
>>> will discuss threading.
>>
>> There are two distinct requirements that must be satisfied by a fast
>> block device.  The device must have fast implementations of aio
>> functions and it must support concurrent request processing.
>>
>> If an aio function blocks in the process of submitting the request, it's
>> by definition slow.  But even if you make the aio functions fast, you
>> still need to be able to support concurrent request processing in order
>> to achieve high throughput.
>>
>> I'm not going to comment in depth on your threading proposal.  When it
>> comes to adding concurrency, I think any approach will require a rewrite
>> of the qcow2 code and if the author of that rewrite is more comfortable
>> implementing concurrency with threads than with a state machine, I'm
>> happy with a threaded implementation.
>>
>> I'd suggest avoiding hyperbole like "a full state machine is
>> prohibitively complex".  QED is a full state machine.  qcow2 adds a
>> number of additional states because of the additional metadata and sync
>> operations but it's not an exponential increase in complexity.
>
> qcow2 will bring in quite a few additional states, but I suspect the
> really hard thing is getting the dependencies between requests right.
>
> I just had a look at how QED is doing this, and it seems to take the
> easy solution, namely allowing only one allocation at the same time.

One L2 allocation, not cluster allocations.  You can allocate multiple
clusters concurrently and you can read/write L2s concurrently.  Since L2
allocation only happens every 2GB, it's a rare event.
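
To show where a figure like "every 2GB" can come from, here is a rough
calculation.  The 64 KiB cluster size, 4-cluster table size and 8-byte
table entries are assumed defaults, not numbers stated in this thread, and
the function name is made up for illustration.

```python
# Sketch: how much guest data a single L2 table maps.
# ASSUMPTIONS (not from this thread): 64 KiB clusters, L2 tables that are
# 4 clusters long, and 8-byte table entries (one entry per data cluster).
CLUSTER_SIZE = 64 * 1024
TABLE_CLUSTERS = 4
ENTRY_SIZE = 8


def l2_coverage_bytes(cluster_size=CLUSTER_SIZE,
                      table_clusters=TABLE_CLUSTERS,
                      entry_size=ENTRY_SIZE):
    """Return the number of data bytes addressed by one L2 table."""
    entries = (cluster_size * table_clusters) // entry_size
    return entries * cluster_size


if __name__ == "__main__":
    gib = l2_coverage_bytes() / 2**30
    print(f"one L2 table covers {gib:g} GiB")
```

Under those assumptions a new L2 table is needed only once per 2 GiB of
newly allocated guest data, while data clusters are allocated every 64 KiB.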
> So this is roughly equivalent to Avi's worker thread that runs today's
> qcow2 code and is protected by a global mutex.

No.  First, you would have to dive deeply into the code and drop the
qemu_mutex any time there is a synchronous IO op.  However, the code is
likely making assumptions today that the introduction of re-entrance in
QEMU could break.  Basically, any time there's an unlock of qemu_mutex(),
any state outside of qcow2 may have changed.  Are you 100% confident
that's okay in the current code?

But with Avi's proposal, you don't have concurrency with any cluster
allocation or read/write of metadata so it's not the same as QED today.

> No doubt compared to real concurrency this is relatively easy to achieve
> in either model (but probably easier with threads).

FWIW, concurrent L2 allocation is not that hard in QED.  It just wasn't
that important for the first implementation, but it involves adding a work
queue to each CachedL2Table such that you can stall multiple requests to
the same L2 table while it's being allocated.  It doesn't change the
state machine at all.

Regards,

Anthony Liguori

> Kevin
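
The per-table work queue mentioned above could look roughly like the
sketch below.  It is an illustration only, not QED code: the class reuses
the CachedL2Table name from the mail, but its fields and the helpers
request_l2() and allocation_done() are hypothetical, and callbacks stand in
for whatever request continuation the real state machine would use.

```python
from collections import deque


class CachedL2Table:
    """Hypothetical stand-in for a cached L2 table with a wait queue."""

    def __init__(self):
        self.ready = False        # table contents are usable
        self.allocating = False   # an allocation is already in flight
        self.waiters = deque()    # requests stalled on this table


def request_l2(table, on_ready, start_allocation):
    """Run on_ready(table) now if possible, otherwise stall the request.

    Only the first stalled request kicks off the allocation; later ones
    just join the queue for the same table.
    """
    if table.ready:
        on_ready(table)
        return
    table.waiters.append(on_ready)
    if not table.allocating:
        table.allocating = True
        start_allocation(table)


def allocation_done(table):
    """Completion handler: mark the table usable and resume all waiters."""
    table.ready = True
    table.allocating = False
    while table.waiters:
        table.waiters.popleft()(table)


if __name__ == "__main__":
    table, started, resumed = CachedL2Table(), [], []
    request_l2(table, lambda t: resumed.append("req A"), started.append)
    request_l2(table, lambda t: resumed.append("req B"), started.append)
    allocation_done(table)
    print(len(started), resumed)
```

Requests to other L2 tables never touch this queue, so they proceed
independently, and each request's own sequence of steps stays the same: it
simply waits before its L2 lookup.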