From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <53E1DD0E.8080202@redhat.com>
Date: Wed, 06 Aug 2014 09:45:18 +0200
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support
References: <1407209598-2572-1-git-send-email-ming.lei@canonical.com> <20140805094844.GF4391@noname.str.redhat.com> <20140805134815.GD12251@stefanha-thinkpad.redhat.com> <20140805144728.GH4391@noname.str.redhat.com>
In-Reply-To:
To: Ming Lei, Kevin Wolf
Cc: Peter Maydell, Fam Zheng, qemu-devel, Stefan Hajnoczi, "Michael S. Tsirkin"

On 06/08/2014 07:33, Ming Lei wrote:
>> I played a bit with the following, I hope it's not too naive. I couldn't
>> see a difference with your patches, but at least one reason for this is
>> probably that my laptop SSD isn't fast enough to make the CPU the
>> bottleneck. I haven't tried a ramdisk yet; that would probably be the
>> next thing. (I actually wrote the patch up just for some profiling on my
>> own, not for comparing throughput, but it should be usable for that as
>> well.)
>
> This might not be good for the test, since it is basically a sequential
> read test, which can be optimized a lot by the kernel. I always use a
> randread benchmark.
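For readers unfamiliar with the kind of randread benchmark being referred to, a typical fio job file looks like the following. All parameters here are illustrative assumptions, not taken from this thread; in particular the device path and iodepth would depend on the actual test setup:

```ini
; hypothetical fio job for random 4k reads against a virtio-blk device
[randread]
ioengine=libaio
rw=randread
bs=4k
iodepth=32
direct=1
runtime=60
filename=/dev/vdb   ; placeholder device path
```

With `direct=1` and random offsets, the kernel's readahead cannot help, so the measured IOPS track the per-request cost of the I/O path more closely than a sequential read would.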
A microbenchmark already exists in tests/test-coroutine.c, and it doesn't really tell us much; it's obvious that coroutines execute more code. The question is why that affects IOPS.

Sequential read should be the right workload here. For fio, you want to push as many IOPS as possible to QEMU, so you need randread. But qemu-img is not run in a guest, and if the kernel optimizes sequential reads, then the bypass should show even more benefit, because it makes userspace proportionally more expensive.

In any case, the patches as written have no hope of being accepted. If you "invert" the logic from aio->co to co->aio, that would be much better, even if it's tedious.

Paolo