From: Christian Borntraeger
Date: Tue, 22 Nov 2016 20:21:16 +0100
To: Stefan Hajnoczi, qemu-devel@nongnu.org
Cc: Paolo Bonzini, Karl Rister, Fam Zheng
Subject: Re: [Qemu-devel] [PATCH v3 00/10] aio: experimental virtio-blk polling mode
In-Reply-To: <1479832306-26440-1-git-send-email-stefanha@redhat.com>
References: <1479832306-26440-1-git-send-email-stefanha@redhat.com>

On 11/22/2016 05:31 PM, Stefan Hajnoczi wrote:
> v3:
>  * Avoid ppoll(2)/epoll_wait(2) if polling succeeded [Paolo]
>  * Disable guest->host virtqueue notification during polling [Christian]
>  * Rebased on top of my virtio-blk/scsi virtqueue notification disable patches
>
> v2:
>  * Uninitialized node->deleted gone [Fam]
>  * Removed 1024 polling loop iteration qemu_clock_get_ns() optimization which
>    created a weird step pattern [Fam]
>  * Unified with AioHandler, dropped AioPollHandler struct [Paolo]
>    (actually I think Paolo had more in mind but this is the first step)
>  * Only poll when all event loop resources support it [Paolo]
>  * Added run_poll_handlers_begin/end trace events for perf analysis
>  * Sorry, Christian, no virtqueue kick suppression yet
>
> Recent performance investigation work done by Karl Rister shows that the
> guest->host notification takes around 20 us.  This is more than the "overhead"
> of QEMU itself (e.g. block layer).
>
> One way to avoid the costly exit is to use polling instead of notification.
> The main drawback of polling is that it consumes CPU resources.  In order to
> benefit performance the host must have extra CPU cycles available on physical
> CPUs that aren't used by the guest.
>
> This is an experimental AioContext polling implementation.  It adds a polling
> callback into the event loop.  Polling functions are implemented for virtio-blk
> virtqueue guest->host kick and Linux AIO completion.
>
> The QEMU_AIO_POLL_MAX_NS environment variable sets the number of nanoseconds to
> poll before entering the usual blocking poll(2) syscall.  Try setting this
> variable to the time from old request completion to new virtqueue kick.
>
> By default no polling is done.  QEMU_AIO_POLL_MAX_NS must be set to get any
> polling!
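
To restate the mechanism for the archive (my reading of the series, not the
actual patch code -- wait_for_events(), io_poll() and now_ns() below are
made-up names for illustration): the event loop busy-polls the registered
handlers for up to QEMU_AIO_POLL_MAX_NS nanoseconds and only falls back to
the blocking syscall if nothing showed up in that window.  Roughly:

    #define _GNU_SOURCE            /* for ppoll(2) */
    #include <poll.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <time.h>

    static int64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /* io_poll() stands in for the per-handler polling callback, e.g.
     * "does the virtqueue have a new request?" or "is the Linux AIO
     * completion ring non-empty?". */
    static int wait_for_events(struct pollfd *fds, nfds_t nfds,
                               bool (*io_poll)(void *opaque), void *opaque,
                               int64_t max_ns)
    {
        if (max_ns > 0) {
            int64_t deadline = now_ns() + max_ns;
            do {
                if (io_poll(opaque)) {
                    return 1;      /* progress made, no syscall needed */
                }
            } while (now_ns() < deadline);
        }
        return ppoll(fds, nfds, NULL, NULL);   /* usual blocking path */
    }

With max_ns = 0 this degenerates to the current behaviour, which matches
"by default no polling is done" above.
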
The notification suppression alone gives me about 10% more fio throughput for a
single disk.  (Oddly, it seems to help less as more disks are added.)  If I set
polling to a high value (e.g. QEMU_AIO_POLL_MAX_NS=500000) the guest->host
notification rate basically drops to zero, so it seems to work as expected.
Polling also seems to provide some benefit in the range of another 10 percent
(again only for a single disk?).  So in general this looks promising.

We should keep it disabled by default, as it is here, until we have some
grow/shrink heuristics.  There is one thing the kernel can do that we cannot do
easily: check whether the CPU is contended and avoid polling in that case.  One
wild idea would be to use clock_gettime() with CLOCK_THREAD_CPUTIME_ID and
CLOCK_REALTIME and shrink the polling time if we have been scheduled away (a
rough sketch of such a check is at the end of this mail).

The case "number of iothreads > number of cpus" looks better than in v1.  Have
you fixed something?

> Stefan Hajnoczi (10):
>   virtio: add missing vdev->broken check
>   virtio-blk: suppress virtqueue kick during processing
>   virtio-scsi: suppress virtqueue kick during processing
>   aio: add AioPollFn and io_poll() interface
>   aio: add polling mode to AioContext
>   virtio: poll virtqueues for new buffers
>   linux-aio: poll ring for completions
>   virtio: turn vq->notification into a nested counter
>   aio: add .io_poll_begin/end() callbacks
>   virtio: disable virtqueue notifications during polling
>
>  aio-posix.c                 | 197 +++++++++++++++++++++++++++++++++++++++++++-
>  async.c                     |  14 +++-
>  block/curl.c                |   8 +-
>  block/iscsi.c               |   3 +-
>  block/linux-aio.c           |  19 ++++-
>  block/nbd-client.c          |   8 +-
>  block/nfs.c                 |   7 +-
>  block/sheepdog.c            |  26 +++---
>  block/ssh.c                 |   4 +-
>  block/win32-aio.c           |   4 +-
>  hw/block/virtio-blk.c       |  18 ++--
>  hw/scsi/virtio-scsi.c       |  36 ++++----
>  hw/virtio/virtio.c          |  60 ++++++++++++--
>  include/block/aio.h         |  28 ++++++-
>  iohandler.c                 |   2 +-
>  nbd/server.c                |   9 +-
>  stubs/set-fd-handler.c      |   1 +
>  tests/test-aio.c            |   4 +-
>  trace-events                |   4 +
>  util/event_notifier-posix.c |   2 +-
>  20 files changed, 378 insertions(+), 76 deletions(-)
>
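
As for the clock_gettime() idea above, something along these lines is what I
have in mind (just a sketch; read_ns(), adjust_poll_ns() and the 10% threshold
are made up, not code from the series):

    #define _POSIX_C_SOURCE 200809L   /* clock_gettime, CLOCK_THREAD_CPUTIME_ID */
    #include <stdint.h>
    #include <time.h>

    static int64_t read_ns(clockid_t id)
    {
        struct timespec ts;
        clock_gettime(id, &ts);
        return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    /* Take wall-clock and thread-CPU timestamps just before and just after a
     * polling window.  If wall time advanced noticeably more than CPU time,
     * the thread was scheduled away, so polling only added latency: shrink
     * the budget for the next round. */
    static int64_t adjust_poll_ns(int64_t poll_ns,
                                  int64_t wall_before, int64_t wall_after,
                                  int64_t cpu_before, int64_t cpu_after)
    {
        int64_t wall_elapsed = wall_after - wall_before;
        int64_t cpu_elapsed  = cpu_after - cpu_before;

        /* Arbitrary threshold: more than ~10% of the window spent off-CPU. */
        if (wall_elapsed - cpu_elapsed > wall_elapsed / 10) {
            poll_ns /= 2;
        }
        return poll_ns;
    }

with wall_* read via read_ns(CLOCK_REALTIME) and cpu_* via
read_ns(CLOCK_THREAD_CPUTIME_ID).  A matching grow heuristic would still be
needed, of course.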