From: Stefan Hajnoczi <stefanha@gmail.com>
To: Vivek Goyal <vgoyal@redhat.com>
Cc: Anthony Liguori <anthony@codemonkey.ws>, kwolf@redhat.com, stefanha@linux.vnet.ibm.com, Mike Snitzer <snitzer@redhat.com>, guijianfeng@cn.fujitsu.com, qemu-devel@nongnu.org, wuzhy@cn.ibm.com, herbert@gondor.hengli.com.au, Joe Thornber <ejt@redhat.com>, Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>, luowenj@cn.ibm.com, kvm@vger.kernel.org, zhanx@cn.ibm.com, zhaoyang@cn.ibm.com, llim@redhat.com, Ryan A Harper <raharper@us.ibm.com>
Subject: Re: [Qemu-devel] [RFC] QEMU disk I/O limits
Date: Wed, 1 Jun 2011 23:28:27 +0100
Message-ID: <BANLkTinvO2Sku5jGwDu98EWa56BUhgvx6A@mail.gmail.com>
In-Reply-To: <20110601214212.GB17449@redhat.com>

On Wed, Jun 1, 2011 at 10:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Wed, Jun 01, 2011 at 10:15:30PM +0100, Stefan Hajnoczi wrote:
>> One issue that concerns me is how effective iops and throughput are as
>> capping mechanisms.  If you cap throughput then you're likely to affect
>> sequential I/O but do little against random I/O, which can hog the disk
>> with a seeky I/O pattern.  If you limit iops you can cap random I/O but
>> artificially limit sequential I/O, which may be able to perform a high
>> number of iops without hogging the disk at all due to seek times.  One
>> proposed solution here (I think Christoph Hellwig suggested it) is to
>> do something like merged sequential I/O counting, so that multiple
>> sequential I/Os only count as 1 iop.
>
> One of the things we at least need to do is allow specifying both bps
> and iops rules together, so that random IO with high iops does not
> create havoc and sequential or large-size IO with low iops and high
> bps does not overload the system.
>
> I am not sure how IO shows up in qemu, but will the elevator in the
> guest make sure that a lot of sequential IO is merged together?  For
> dependent READs, I think counting multiple sequential reads as 1 iop
> might help.  I think this is an optimization one can do once throttling
> starts working in qemu, and see if it is a real concern.

The guest can use an I/O scheduler, so for Linux guests we see the
typical effects of cfq.  Requests do get merged by the guest before
being submitted to QEMU.

Okay, good idea.  Zhi Yong's test plan includes tests with multiple VMs
and both iops and throughput limits at the same time.  If workloads
turn up that cause issues, it would be possible to look at counting
sequential I/Os as 1 iop.

>> I like the idea of a proportional share of disk utilization but doing
>> that from QEMU is problematic since we only know when we issued an
>> I/O to the kernel, not when it's actually being serviced by the disk -
>> there could be queue wait times in the block layer that we don't know
>> about - so we end up with a magic number for disk utilization which
>> may not be a very meaningful number.
>
> To be able to implement proportional IO one should be able to see all
> IO from all clients at one place.  Qemu knows about IO of only its
> guest and not other guests running on the system.  So I think qemu
> can't implement proportional IO.

Yeah :(

>> So given the constraints and the backends we need to support, disk
>> I/O limits in QEMU with iops and throughput limits seem like the
>> approach we need.
>
> For qemu yes.  For other non-qemu usages we will still require a
> kernel mechanism of throttling.

Definitely.  In fact I like the idea of using blkio-controller for raw
image files on local file systems or LVM volumes.  Hopefully the
end-user API (libvirt interface) that QEMU disk I/O limits gets exposed
from complements the existing blkiotune (blkio-controller) virsh
command.

Stefan
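[Editor's note: the two ideas discussed above — enforcing bps and iops limits together, and counting a run of merged sequential I/Os as a single iop — can be sketched as a toy model. This is not QEMU's implementation; the class and method names below are made up for illustration, and it assumes a fixed one-second accounting slice that the caller resets.]

```python
# Toy sketch of combined bps + iops throttling with sequential-merge
# iop accounting, as discussed in the thread.  Hypothetical names;
# not QEMU code.

class DiskThrottle:
    """Admit an I/O only if BOTH the bytes/sec and the iops budget allow it.

    Consecutive sequential requests cost no additional iops, so a large
    sequential stream is bounded by the bps limit rather than the iops
    limit, while random I/O is bounded by iops.
    """

    def __init__(self, bps_limit, iops_limit):
        self.bps_limit = bps_limit
        self.iops_limit = iops_limit
        self.bytes_used = 0      # bytes admitted in the current 1 s slice
        self.iops_used = 0       # iops admitted in the current 1 s slice
        self.next_offset = None  # where the last admitted request ended

    def reset_slice(self):
        """Call once per second to start a new accounting slice."""
        self.bytes_used = 0
        self.iops_used = 0

    def try_submit(self, offset, length):
        """Return True if the request may be submitted now."""
        # A request is "sequential" if it starts exactly where the
        # previous one ended; such requests cost 0 additional iops.
        iop_cost = 0 if offset == self.next_offset else 1

        if self.bytes_used + length > self.bps_limit:
            return False   # bps budget exhausted: defer the request
        if self.iops_used + iop_cost > self.iops_limit:
            return False   # iops budget exhausted: defer the request

        self.bytes_used += length
        self.iops_used += iop_cost
        self.next_offset = offset + length
        return True


throttle = DiskThrottle(bps_limit=1024 * 1024, iops_limit=4)

# Sequential stream: only the first request costs an iop, so all 16
# 4 KB requests are admitted despite the low iops limit.
seq = [throttle.try_submit(off, 4096) for off in range(0, 65536, 4096)]

# Random offsets: every request costs an iop, so the fifth one exceeds
# the iops budget and is deferred.
throttle.reset_slice()
throttle.next_offset = None
rand = [throttle.try_submit(off, 4096)
        for off in (0, 999424, 8192, 524288, 65536)]
```

A real implementation would queue deferred requests and resubmit them in a later slice rather than just refusing them, but the admission test shows why the two limits complement each other.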