* [Qemu-devel] [RFC] QEMU disk I/O limits (56+ messages in thread)
From: Zhi Yong Wu @ 2011-05-30 5:09 UTC
To: qemu-devel, kvm
Cc: kwolf, vgoyal, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim

Hello, all,

I plan to work on a feature called "Disk I/O limits" for the qemu-kvm project. This feature will let the user cap the amount of disk I/O performed by a VM. It matters when storage resources are shared among multiple VMs: if some VMs do excessive disk I/O, they hurt the performance of the other VMs.

More detail is available here:
http://wiki.qemu.org/Features/DiskIOLimits

1.) Why we need per-drive disk I/O limits
On Linux, the cgroup blkio controller already supports I/O throttling on block devices. However, there is no single mechanism for disk I/O throttling across all underlying storage types (image file, LVM, NFS, Ceph), and for some types there is no way to throttle at all.

The disk I/O limits feature introduces QEMU block layer I/O limits together with command-line and QMP interfaces for configuring them. This allows I/O limits to be imposed across all underlying storage types using a single interface.

2.) How disk I/O limits will be implemented
The QEMU block layer will introduce a per-drive disk I/O request queue for those disks whose "disk I/O limits" feature is enabled. It can control disk I/O limits individually for each disk when multiple disks are attached to a VM, enabling use cases like unlimited local disk access combined with limited shared storage access.

In a multiple-I/O-threads scenario, when an application in a VM issues a block I/O request, the request is intercepted by the QEMU block layer, which calculates the drive's runtime I/O rate and determines whether it has gone beyond its limits. If so, the request is enqueued on the per-drive queue; otherwise it is serviced immediately.

3.) How the users enable and play with it
The QEMU -drive option will be extended so that disk I/O limits can be specified on the command line, for example -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx]. When such an argument is specified, the "disk I/O limits" feature is enabled for that drive. The feature will also let users change per-drive disk I/O limits at runtime using QMP commands.

Regards,

Zhiyong Wu
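The queue-and-dispatch scheme described in section 2 can be sketched as a token-bucket check per drive. This is an illustrative model of the RFC's idea only, not QEMU code; the class and method names are invented for the sketch:

```python
import time
from collections import deque

class DriveThrottle:
    """Per-drive I/O limit sketch: admit a request while the current
    rate is under the configured iops limit, otherwise enqueue it on
    the per-drive queue (illustrative, not the eventual QEMU API)."""

    def __init__(self, iops_limit):
        self.iops_limit = iops_limit       # allowed requests per second
        self.tokens = float(iops_limit)    # start with a full bucket
        self.last = time.monotonic()
        self.pending = deque()             # the per-drive request queue

    def _refill(self):
        # Accrue tokens proportionally to elapsed time, capped at the limit.
        now = time.monotonic()
        self.tokens = min(float(self.iops_limit),
                          self.tokens + (now - self.last) * self.iops_limit)
        self.last = now

    def submit(self, request):
        """Return True if the request is serviced now, False if deferred."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True                    # under the limit: service it
        self.pending.append(request)       # over the limit: queue it
        return False

throttle = DriveThrottle(iops_limit=2)
results = [throttle.submit(f"req{i}") for i in range(5)]
# the first two requests are admitted immediately; the rest are queued
```

A dispatch loop (or timer callback, as in the real block layer) would later pop `pending` entries as tokens accrue; that part is omitted here.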
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Vivek Goyal @ 2011-05-31 13:45 UTC
To: Zhi Yong Wu
Cc: qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim

On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote:
> I have prepared to work on a feature called "Disk I/O limits" for qemu-kvm projeect.
> This feature will enable the user to cap disk I/O amount performed by a VM.
> [...]

Hi Zhiyong,

Why not use the kernel blkio controller for this? Why reinvent the wheel and implement the feature again in qemu?

Thanks
Vivek
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Anthony Liguori @ 2011-05-31 13:50 UTC
To: Vivek Goyal
Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper

On 05/31/2011 08:45 AM, Vivek Goyal wrote:
> Why not use kernel blkio controller for this and why reinvent the wheel
> and implement the feature again in qemu?

The blkio controller only works for block devices. It doesn't work when using files.

Regards,

Anthony Liguori
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Vivek Goyal @ 2011-05-31 14:04 UTC
To: Anthony Liguori
Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper

On Tue, May 31, 2011 at 08:50:40AM -0500, Anthony Liguori wrote:
> blkio controller only works for block devices. It doesn't work when
> using files.

So can't we come up with something to easily determine which device backs this file? That still won't work for NFS-backed storage, though.

Thanks
Vivek
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Anthony Liguori @ 2011-05-31 14:25 UTC
To: Vivek Goyal
Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper

On 05/31/2011 09:04 AM, Vivek Goyal wrote:
> So can't we comeup with something to easily determine which device backs
> up this file? Though that will still not work for NFS backed storage
> though.

Right.

Additionally, in QEMU we can rate limit based on concepts that make sense to a guest. We can limit the actual I/O ops visible to the guest, which means we'll get consistent performance regardless of whether the backing file is qcow2, raw, LVM, or raw over NFS.

The kernel just doesn't have enough information to do a good job here.

Regards,

Anthony Liguori
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Vivek Goyal @ 2011-05-31 17:59 UTC
To: Anthony Liguori
Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper, Mike Snitzer, Joe Thornber

On Tue, May 31, 2011 at 09:25:31AM -0500, Anthony Liguori wrote:
> Additionally, in QEMU, we can rate limit based on concepts that make
> sense to a guest. We can limit the actual I/O ops visible to the
> guest which means that we'll get consistent performance regardless
> of whether the backing file is qcow2, raw, LVM, or raw over NFS.

Are you referring to merging taking place, which can change the definition of IOPS as seen by the guest? We do throttling at the bio level and no merging takes place there, so IOPS as seen by the guest and as seen by the throttling logic should be the same. Readahead would be one exception, where any readahead data is charged to the guest.

Device throttling and interaction with the file system is still an issue for the IO controller (things like journalling lead to serialization), where a faster group can get blocked behind a slower group. That's why, at the moment, the recommendation is to export devices/partitions directly to virtual machines if throttling is to be used, and not to share a file system across VMs.

> The kernel just doesn't have enough information to do a good job here.

[CCing a couple of device mapper folks for thoughts on the below]

When I think more about it, this problem is very similar to other features like snapshotting: should we implement snapshotting in qemu, or use some kernel-based solution like dm-snapshot or dm-multisnap? I don't have a good answer for that. Has this debate been settled already? I see that development is happening in the kernel for providing dm snapshot capabilities, and Mike Snitzer also mentioned the possibility of using dm-loop to cover the case of files over NFS etc.

Some thoughts in general, though:

- Any kernel-based solution is generic and can be used in other contexts too, like containers or bare metal.

- In some cases the kernel can implement throttling more efficiently. For example, if a block device has multiple partitions and these partitions are exported to VMs, the kernel can maintain a single queue and a single set of timers to manage all the VMs doing IO to that device. In a user-space solution we would have to manage as many queues and timers as there are VMs. So a kernel implementation can be more efficient in certain cases.

- Things like dm-loop essentially introduce another block layer on top of the file system layer. I personally think that does not sound very clean and might slow things down, though I don't have any data. Has there been any discussion/conclusion on this?

- A qemu-based scheme will work well with all kinds of targets. To use a kernel-based scheme, one would have to switch to kernel-provided snapshotting schemes (dm-snapshot or dm-multisnap etc.); otherwise a READ might come from a base image which is on another device, and we would not throttle the VM.

Thanks
Vivek
* Re: [Qemu-devel] [RFC] QEMU disk I/O limits
From: Anthony Liguori @ 2011-05-31 18:39 UTC
To: Vivek Goyal
Cc: kwolf, stefanha, kvm, guijianfeng, Mike Snitzer, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper

On 05/31/2011 12:59 PM, Vivek Goyal wrote:
> Are you referring to merging taking place which can change the definition
> of IOPS as seen by guest?

No. With qcow2, it may take multiple real IOPs for what the guest sees as one IOP.

That's really the main argument I'm making here. The only entity that knows what a guest IOP corresponds to is QEMU. On the backend, it may end up being a network request, multiple BIOs to physical disks, file access, etc. That's why QEMU is the right place to do the throttling for this use case.

That doesn't mean device-level throttling isn't useful, just that for virtualization it makes more sense to do it in QEMU.

Regards,

Anthony Liguori
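Anthony's amplification point can be made concrete with a toy model: a guest write into a qcow2 image may cost extra host I/Os when the target cluster is not yet allocated (metadata updates are needed in addition to the data write). The I/O counts below are illustrative only, not the real qcow2 cost model:

```python
def host_ios_for_guest_write(cluster_allocated):
    """Toy model of guest-IOP vs. host-IOP amplification for qcow2.

    A write to an already-allocated cluster is a single host I/O; a
    write that allocates a new cluster also has to update the L2 table
    and the refcount table (illustrative counts only).
    """
    if cluster_allocated:
        return 1          # data write only
    return 1 + 1 + 1      # data write + L2 table update + refcount update

# Three guest IOPs (one to an allocated cluster, two allocating writes)
# turn into seven host I/Os, so throttling host-side BIOs and throttling
# guest-visible IOPs measure different things.
costs = [host_ios_for_guest_write(a) for a in (True, False, False)]
total_host_ios = sum(costs)
```

This is why counting at the bio level, below the image format, cannot recover the guest-visible request rate that QEMU sees.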
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 18:39 ` Anthony Liguori @ 2011-05-31 19:24 ` Vivek Goyal -1 siblings, 0 replies; 56+ messages in thread From: Vivek Goyal @ 2011-05-31 19:24 UTC (permalink / raw) To: Anthony Liguori Cc: kwolf, stefanha, kvm, guijianfeng, Mike Snitzer, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper On Tue, May 31, 2011 at 01:39:47PM -0500, Anthony Liguori wrote: > On 05/31/2011 12:59 PM, Vivek Goyal wrote: > >On Tue, May 31, 2011 at 09:25:31AM -0500, Anthony Liguori wrote: > >>On 05/31/2011 09:04 AM, Vivek Goyal wrote: > >>>On Tue, May 31, 2011 at 08:50:40AM -0500, Anthony Liguori wrote: > >>>>On 05/31/2011 08:45 AM, Vivek Goyal wrote: > >>>>>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > >>>>>>Hello, all, > >>>>>> > >>>>>> I have prepared to work on a feature called "Disk I/O limits" for qemu-kvm projeect. > >>>>>> This feature will enable the user to cap disk I/O amount performed by a VM.It is important for some storage resources to be shared among multi-VMs. As you've known, if some of VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > >>>>>> > >>>>> > >>>>>Hi Zhiyong, > >>>>> > >>>>>Why not use kernel blkio controller for this and why reinvent the wheel > >>>>>and implement the feature again in qemu? > >>>> > >>>>blkio controller only works for block devices. It doesn't work when > >>>>using files. > >>> > >>>So can't we comeup with something to easily determine which device backs > >>>up this file? Though that will still not work for NFS backed storage > >>>though. > >> > >>Right. > >> > >>Additionally, in QEMU, we can rate limit based on concepts that make > >>sense to a guest. We can limit the actual I/O ops visible to the > >>guest which means that we'll get consistent performance regardless > >>of whether the backing file is qcow2, raw, LVM, or raw over NFS. 
> >> > > > >Are you referring to merging taking place which can change the definition > >of IOPS as seen by guest? > > No, with qcow2, it may take multiple real IOPs for what the guest > sees as an IOP. > > That's really the main argument I'm making here. The only entity > that knows what a guest IOP corresponds to is QEMU. On the backend, > it may end up being a network request, multiple BIOs to physical > disks, file access, etc. Ok, so we seem to be talking of two requirements. - A consistent experience to guest - Isolation between VMs. If this qcow2 mapping/metadata overhead is not significant, then we don't have to worry about IOPs perceived by guest. It will be more or less same. If it is significant then we provide a more consistent experience to guest but then weaken the isolation between guests and might overload the backend storage, and in turn might not get the expected IOPS for the guest anyway. So I think these two things are not independent. I agree though that the advantage of qemu is that everything is a file and handling all the complex configurations becomes very easy. Having said that, to provide a consistent experience to guest, you also need to know where IO from guest is going and whether the underlying storage system can support that kind of IO or not. IO limits are of not much use if these are put in place in isolation without knowing where IO is going and how many VMs are doing IO to it. Otherwise there are no guarantees/estimates on minimum bandwidth for guests, hence there is no consistent experience. Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 19:24 ` Vivek Goyal (?) @ 2011-05-31 23:30 ` Anthony Liguori 2011-06-01 13:20 ` Vivek Goyal 2011-06-04 8:54 ` Blue Swirl -1 siblings, 2 replies; 56+ messages in thread From: Anthony Liguori @ 2011-05-31 23:30 UTC (permalink / raw) To: Vivek Goyal Cc: kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On 05/31/2011 02:24 PM, Vivek Goyal wrote: > On Tue, May 31, 2011 at 01:39:47PM -0500, Anthony Liguori wrote: >> On 05/31/2011 12:59 PM, Vivek Goyal wrote: > Ok, so we seem to be talking of two requirements. > > - A consistent experience to guest > - Isolation between VMs. > > If this qcow2 mapping/metada overhead is not significant, then we > don't have to worry about IOPs perceived by guest. It will be more or less > same. If it is significant then we provide more consistent experience to > guest but then weaken the isolation between guest and might overload the > backend storage and in turn might not get the expected IOPS for the > guest anyway. That's quite a bit of hand waving considering your following argument is that you can't be precise enough at the QEMU level. > So I think these two things are not independent. > > I agree though that advantage of qemu is that everything is a file > and handling all the complex configuraitons becomes very easy. > > Having said that, to provide a consistent experience to guest, you > also need to know where IO from guest is going and whether underlying > storage system can support that kind of IO or not. > > IO limits are of not much use if if these are put in isolation without > knowing where IO is going and how many VMs are doing IO to it. Otherwise > there are no gurantees/estimates on minimum bandwidth for guests hence > there is no consistent experience. Consistent and maximum are two very different things. QEMU can, very effectively, enforce a maximum I/O rate. 
This can then be used to provide mostly consistent performance across different generations of hardware, to implement service levels in a tiered offering, etc. The level of consistency will then depend on whether you overcommit your hardware and how you have it configured. Consistency is very hard because at the end of the day, you still have shared resources. Even with blkio, I presume one guest can still impact another guest by forcing the disk to do excessive seeking or something of that nature. So absolute consistency can't be the requirement for the use-case. The use-cases we are interested in are really more about providing caps than anything else. Regards, Anthony Liguori > > Thanks > Vivek > ^ permalink raw reply [flat|nested] 56+ messages in thread
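The maximum-rate enforcement Anthony describes is typically implemented with token-bucket accounting in the block layer. A minimal sketch of that idea follows; this is illustrative Python, not QEMU's actual code, and every name in it is hypothetical:

```python
class TokenBucket:
    """Illustrative token-bucket limiter. 'capacity' tokens refill at
    'rate' tokens per second; each I/O request consumes one token, so
    'rate' is the enforced maximum IOPS and 'capacity' the burst size."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate          # refill rate, tokens (requests) per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = now

    def allow(self, now):
        """Return True if a request may be serviced at time 'now';
        otherwise the proposed per-drive queue would hold it."""
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Cap at 100 IOPS with a burst of 10: a burst of 20 requests at t=0
# drains the bucket after 10; the rest would be queued, not dropped.
bucket = TokenBucket(rate=100, capacity=10)
serviced = sum(bucket.allow(0.0) for _ in range(20))
print(serviced)  # 10
```

A real implementation would also dequeue throttled requests as tokens refill, and the same structure covers a bytes-per-second limit by consuming request-size tokens instead of one per request.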
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 23:30 ` Anthony Liguori @ 2011-06-01 13:20 ` Vivek Goyal 2011-06-01 21:15 ` Stefan Hajnoczi 2011-06-04 8:54 ` Blue Swirl 1 sibling, 1 reply; 56+ messages in thread From: Vivek Goyal @ 2011-06-01 13:20 UTC (permalink / raw) To: Anthony Liguori Cc: kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Tue, May 31, 2011 at 06:30:09PM -0500, Anthony Liguori wrote: [..] > The level of consistency will then depend on whether you overcommit > your hardware and how you have it configured. Agreed. > > Consistency is very hard because at the end of the day, you still > have shared resources. Even with blkio, I presume one guest can > still impact another guest by forcing the disk to do excessive > seeking or something of that nature. > > So absolutely consistency can't be the requirement for the use-case. > The use-cases we are interested really are more about providing caps > than anything else. I think both qemu and the kernel can do the job. The only thing which seriously favors a throttling implementation in qemu is the ability to handle a wide variety of backend files (NFS, qcow, libcurl based devices etc). So what I am arguing is that your previous reason, that qemu can do a better job because it knows the effective IOPS of the guest, is not necessarily a very good reason. To me the simplicity of being able to handle everything as a file and do the throttling there is the most compelling reason to do this implementation in qemu. Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 13:20 ` Vivek Goyal @ 2011-06-01 21:15 ` Stefan Hajnoczi 0 siblings, 0 replies; 56+ messages in thread From: Stefan Hajnoczi @ 2011-06-01 21:15 UTC (permalink / raw) To: Vivek Goyal Cc: Anthony Liguori, kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Wed, Jun 1, 2011 at 2:20 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Tue, May 31, 2011 at 06:30:09PM -0500, Anthony Liguori wrote: > > [..] >> The level of consistency will then depend on whether you overcommit >> your hardware and how you have it configured. > > Agreed. > >> >> Consistency is very hard because at the end of the day, you still >> have shared resources. Even with blkio, I presume one guest can >> still impact another guest by forcing the disk to do excessive >> seeking or something of that nature. >> >> So absolutely consistency can't be the requirement for the use-case. >> The use-cases we are interested really are more about providing caps >> than anything else. > > I think both qemu and kenrel can do the job. The only thing which > seriously favors throttling implementation in qemu is the ability > to handle wide variety of backend files (NFS, qcow, libcurl based > devices etc). > > So what I am arguing is that your previous reason that qemu can do > a better job because it knows effective IOPS of guest, is not > necessarily a very good reason. To me simplicity of being able to handle > everything as file and do the throttling is the most compelling reason > to do this implementation in qemu. The variety of backends is the reason to go for a QEMU-based approach. If there were kernel mechanisms to handle non-block backends that would be great. cgroups NFS? 
Of course for something like Sheepdog or Ceph it becomes quite hard to do it in the kernel at all since they are userspace libraries that speak their protocol over sockets, and you really don't have insight into what I/O operations they are doing from the kernel. One issue that concerns me is how effective iops and throughput are as capping mechanisms. If you cap throughput then you're likely to affect sequential I/O but do little against random I/O, which can hog the disk with a seeky I/O pattern. If you limit iops you can cap random I/O but artificially limit sequential I/O, which may be able to perform a high number of iops without hogging the disk due to seek times at all. One proposed solution here (I think Christoph Hellwig suggested it) is to do something like merging sequential I/O counting so that multiple sequential I/Os only count as 1 iop. I like the idea of a proportional share of disk utilization but doing that from QEMU is problematic since we only know when we issued an I/O to the kernel, not when it's actually being serviced by the disk - there could be queue wait times in the block layer that we don't know about - so we end up with a magic number for disk utilization which may not be a very meaningful number. So given the constraints and the backends we need to support, disk I/O limits in QEMU with iops and throughput limits seem like the approach we need. Stefan ^ permalink raw reply [flat|nested] 56+ messages in thread
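The sequential-merge accounting idea mentioned above (attributed in the thread to Christoph Hellwig) can be sketched as a simple counting pass: a run of back-to-back requests is charged as a single iop, and only a seek starts a new one. This is an illustrative sketch, not anything from QEMU:

```python
def count_iops(requests):
    """Charge a run of strictly sequential requests as one I/O op.

    'requests' is a list of (offset, length) byte ranges in submission
    order; a request is sequential if it starts exactly where the
    previous one ended.
    """
    iops = 0
    next_expected = None
    for offset, length in requests:
        if offset != next_expected:   # seek: charge a new iop
            iops += 1
        next_expected = offset + length
    return iops


# Four back-to-back 4 KiB reads count as one iop; seeky I/O counts each.
seq = [(0, 4096), (4096, 4096), (8192, 4096), (12288, 4096)]
random_io = [(0, 4096), (1048576, 4096), (65536, 4096)]
print(count_iops(seq))        # 1
print(count_iops(random_io))  # 3
```

Under this accounting, a sequential streamer is no longer penalized by an iops cap, while a random workload still consumes its full iop budget.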
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 21:15 ` Stefan Hajnoczi @ 2011-06-01 21:42 ` Vivek Goyal -1 siblings, 0 replies; 56+ messages in thread From: Vivek Goyal @ 2011-06-01 21:42 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Anthony Liguori, kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Wed, Jun 01, 2011 at 10:15:30PM +0100, Stefan Hajnoczi wrote: > On Wed, Jun 1, 2011 at 2:20 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > > On Tue, May 31, 2011 at 06:30:09PM -0500, Anthony Liguori wrote: > > > > [..] > >> The level of consistency will then depend on whether you overcommit > >> your hardware and how you have it configured. > > > > Agreed. > > > >> > >> Consistency is very hard because at the end of the day, you still > >> have shared resources. Even with blkio, I presume one guest can > >> still impact another guest by forcing the disk to do excessive > >> seeking or something of that nature. > >> > >> So absolutely consistency can't be the requirement for the use-case. > >> The use-cases we are interested really are more about providing caps > >> than anything else. > > > > I think both qemu and kenrel can do the job. The only thing which > > seriously favors throttling implementation in qemu is the ability > > to handle wide variety of backend files (NFS, qcow, libcurl based > > devices etc). > > > > So what I am arguing is that your previous reason that qemu can do > > a better job because it knows effective IOPS of guest, is not > > necessarily a very good reason. To me simplicity of being able to handle > > everything as file and do the throttling is the most compelling reason > > to do this implementation in qemu. > > The variety of backends is the reason to go for a QEMU-based approach. > If there were kernel mechanisms to handle non-block backends that > would be great. cgroups NFS? 
I agree that because qemu can handle a variety of backends it becomes a very good reason to do throttling in qemu. The kernel currently does not handle files over NFS. There were some suggestions of using a loop or device mapper loop device on top of NFS images and then implementing block device policies like throttling. But I am not convinced that it is a good idea. To cover the case of NFS we probably shall have to implement something in NFS or something more generic in VFS. But I am not sure if file system guys will like it, or whether it is even worth it at this point of time given the fact that the primary use case is qemu and qemu can easily implement this functionality. > > Of course for something like Sheepdog or Ceph it becomes quite hard to > do it in the kernel at all since they are userspace libraries that > speak their protocol over sockets, and you really don't have sinight > into what I/O operations they are doing from the kernel. Agreed. This is another reason why doing it in qemu makes sense. > > One issue that concerns me is how effective iops and throughput are as > capping mechanisms. If you cap throughput then you're likely to > affect sequential I/O but do little against random I/O which can hog > the disk with a seeky I/O pattern. If you limit iops you can cap > random I/O but artifically limit sequential I/O, which may be able to > perform a high number of iops without hogging the disk due to seek > times at all. One proposed solution here (I think Christoph Hellwig > suggested it) is to do something like merging sequential I/O counting > so that multiple sequential I/Os only count as 1 iop. One of the things we at least need to do is allow specifying both bps and iops rules together so that random IO with high iops does not create havoc, and sequential or large size IO with low iops and high bps does not overload the system. I am not sure how IO shows up in qemu but will the elevator in the guest make sure that a lot of sequential IO is merged together? 
For dependent READS, I think counting multiple sequential reads as 1 iop might help. I think this is one optimization one can do once throttling starts working in qemu and see if it is a real concern. > > I like the idea of a proportional share of disk utilization but doing > that from QEMU is problematic since we only know when we issued an I/O > to the kernel, not when it's actually being serviced by the disk - > there could be queue wait times in the block layer that we don't know > about - so we end up with a magic number for disk utilization which > may not be a very meaningful number. To be able to implement proportional IO one should be able to see all IO from all clients in one place. Qemu knows about the IO of only its own guest and not other guests running on the system. So I think qemu can't implement proportional IO. > > So given the constraints and the backends we need to support, disk I/O > limits in QEMU with iops and throughput limits seem like the approach > we need. For qemu yes. For other non-qemu usages we will still require a kernel mechanism of throttling. Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
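Vivek's point about specifying bps and iops rules together comes down to admitting a request only if it passes both limits: the iops cap catches small random I/O, the bps cap catches large sequential I/O. A rough sketch under simplified per-second accounting (illustrative names, not QEMU's):

```python
class DualLimiter:
    """Combine a bytes-per-second and an ops-per-second rule. A request
    is admitted only if neither limit would be exceeded. Accounting is
    per 1-second slice for simplicity; a real implementation would use
    finer-grained buckets."""

    def __init__(self, bps_limit, iops_limit):
        self.bps_limit = bps_limit
        self.iops_limit = iops_limit
        self.window = None   # current 1-second accounting window
        self.bytes = 0
        self.iops = 0

    def admit(self, now, nbytes):
        window = int(now)
        if window != self.window:          # new second: reset counters
            self.window, self.bytes, self.iops = window, 0, 0
        if (self.bytes + nbytes > self.bps_limit
                or self.iops + 1 > self.iops_limit):
            return False                   # would exceed a limit: queue it
        self.bytes += nbytes
        self.iops += 1
        return True


# 1 MiB/s and 100 IOPS: small random requests hit the iops cap,
# large sequential requests hit the bps cap.
lim = DualLimiter(bps_limit=1 << 20, iops_limit=100)
small = sum(lim.admit(0.0, 512) for _ in range(200))       # iops-bound
lim2 = DualLimiter(bps_limit=1 << 20, iops_limit=100)
big = sum(lim2.admit(0.0, 256 * 1024) for _ in range(10))  # bps-bound
print(small, big)  # 100 4
```

Neither workload can overload the backend on its own: 200 tiny requests stop at the 100-iop rule, while ten 256 KiB requests stop at the 1 MiB byte rule after four.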
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 21:42 ` Vivek Goyal @ 2011-06-01 22:28 ` Stefan Hajnoczi -1 siblings, 0 replies; 56+ messages in thread From: Stefan Hajnoczi @ 2011-06-01 22:28 UTC (permalink / raw) To: Vivek Goyal Cc: Anthony Liguori, kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Wed, Jun 1, 2011 at 10:42 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Wed, Jun 01, 2011 at 10:15:30PM +0100, Stefan Hajnoczi wrote: >> One issue that concerns me is how effective iops and throughput are as >> capping mechanisms. If you cap throughput then you're likely to >> affect sequential I/O but do little against random I/O which can hog >> the disk with a seeky I/O pattern. If you limit iops you can cap >> random I/O but artifically limit sequential I/O, which may be able to >> perform a high number of iops without hogging the disk due to seek >> times at all. One proposed solution here (I think Christoph Hellwig >> suggested it) is to do something like merging sequential I/O counting >> so that multiple sequential I/Os only count as 1 iop. > > One of the things we atleast need to do is allow specifying both > bps and iops rule together so that random IO with high iops does > not create havoc and seqential or large size IO with low iops and > high bps does not overload the system. > > I am not sure how IO shows up in qemu but will elevator in guest > make sure that lot of sequential IO is merged together? For dependent > READS, I think counting multiple sequential reads as 1 iops might > help. I think this is one optimization one can do once throttling > starts working in qemu and see if it is a real concern. The guest can use an I/O scheduler, so for Linux guests we see the typical effects of cfq. Requests do get merged by the guest before being submitted to QEMU. Okay, good idea. 
Zhi Yong's test plan includes tests with multiple VMs and both iops and throughput limits at the same time. If workloads turn up that cause issues it would be possible to count sequential I/Os as 1 iop. >> >> I like the idea of a proportional share of disk utilization but doing >> that from QEMU is problematic since we only know when we issued an I/O >> to the kernel, not when it's actually being serviced by the disk - >> there could be queue wait times in the block layer that we don't know >> about - so we end up with a magic number for disk utilization which >> may not be a very meaningful number. > > To be able to implement proportional IO one should be able to see > all IO from all clients at one place. Qemu knows about IO of only > its guest and not other guests running on the system. So I think > qemu can't implement proportion IO. Yeah :( >> >> So given the constraints and the backends we need to support, disk I/O >> limits in QEMU with iops and throughput limits seem like the approach >> we need. > > For qemu yes. For other non-qemu usages we will still require a kernel > mechanism of throttling. Definitely. In fact I like the idea of using blkio-controller for raw image files on local file systems or LVM volumes. Hopefully the end-user API (libvirt interface) through which QEMU disk I/O limits get exposed will complement the existing blkiotune (blkio-controller) virsh command. Stefan ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 23:30 ` Anthony Liguori @ 2011-06-04 8:54 ` Blue Swirl 2011-06-04 8:54 ` Blue Swirl 1 sibling, 0 replies; 56+ messages in thread From: Blue Swirl @ 2011-06-04 8:54 UTC (permalink / raw) To: Anthony Liguori Cc: Vivek Goyal, kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Wed, Jun 1, 2011 at 2:30 AM, Anthony Liguori <anthony@codemonkey.ws> wrote: > On 05/31/2011 02:24 PM, Vivek Goyal wrote: >> >> On Tue, May 31, 2011 at 01:39:47PM -0500, Anthony Liguori wrote: >>> >>> On 05/31/2011 12:59 PM, Vivek Goyal wrote: >> >> Ok, so we seem to be talking of two requirements. >> >> - A consistent experience to guest >> - Isolation between VMs. >> >> If this qcow2 mapping/metadata overhead is not significant, then we >> don't have to worry about IOPs perceived by guest. It will be more or less >> same. If it is significant then we provide more consistent experience to >> guest but then weaken the isolation between guests and might overload the >> backend storage and in turn might not get the expected IOPS for the >> guest anyway. > > That's quite a bit of hand waving considering your following argument is > that you can't be precise enough at the QEMU level. > >> So I think these two things are not independent. >> >> I agree though that advantage of qemu is that everything is a file >> and handling all the complex configurations becomes very easy. >> >> Having said that, to provide a consistent experience to guest, you >> also need to know where IO from guest is going and whether underlying >> storage system can support that kind of IO or not. >> >> IO limits are of not much use if these are put in isolation without >> knowing where IO is going and how many VMs are doing IO to it. Otherwise >> there are no guarantees/estimates on minimum bandwidth for guests hence >> there is no consistent experience.
> > Consistent and maximum are two very different things. > > QEMU can, very effectively, enforce a maximum I/O rate. This can then be > used to provide mostly consistent performance across different generations > of hardware, to implement service levels in a tiered offering, etc. What is the point of view, guest or host? It is not possible to enforce any rates which would make sense to guests without taking into account the guest clock and execution speed. If instead you mean the host rate (which would not be in sync with I/O rates seen by the guest), then I'd suppose metadata accesses would also matter, and then the host facilities should produce the same results. On the positive side, they may only exist on newer Linux and not on other OSes, so introducing them to QEMU would not be such a bad idea. > The level of consistency will then depend on whether you overcommit your > hardware and how you have it configured. > > Consistency is very hard because at the end of the day, you still have > shared resources. Even with blkio, I presume one guest can still impact > another guest by forcing the disk to do excessive seeking or something of > that nature. > > So absolute consistency can't be the requirement for the use-case. The > use-cases we are interested in really are more about providing caps than > anything else. > > Regards, > > Anthony Liguori > >> >> Thanks >> Vivek >> > > > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [RFC]QEMU disk I/O limits 2011-05-31 18:39 ` Anthony Liguori @ 2011-05-31 20:48 ` Mike Snitzer -1 siblings, 0 replies; 56+ messages in thread From: Mike Snitzer @ 2011-05-31 20:48 UTC (permalink / raw) To: Anthony Liguori Cc: Vivek Goyal, kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper On Tue, May 31 2011 at 2:39pm -0400, Anthony Liguori <anthony@codemonkey.ws> wrote: > On 05/31/2011 12:59 PM, Vivek Goyal wrote: > >On Tue, May 31, 2011 at 09:25:31AM -0500, Anthony Liguori wrote: > >>On 05/31/2011 09:04 AM, Vivek Goyal wrote: > >>>On Tue, May 31, 2011 at 08:50:40AM -0500, Anthony Liguori wrote: > >>>>On 05/31/2011 08:45 AM, Vivek Goyal wrote: > >>>>>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > >>>>>>Hello, all, > >>>>>> > >>>>>> I have prepared to work on a feature called "Disk I/O limits" for the qemu-kvm project. > >>>>>> This feature will enable the user to cap the disk I/O amount > performed by a VM. It is important for some storage resources to be > shared among multiple VMs. As you know, if some of the VMs are doing > excessive disk I/O, they will hurt the performance of other VMs. > >>>>>> > >>>>> > >>>>>Hi Zhiyong, > >>>>> > >>>>>Why not use kernel blkio controller for this and why reinvent the wheel > >>>>>and implement the feature again in qemu? > >>>> > >>>>blkio controller only works for block devices. It doesn't work when > >>>>using files. > >>> > >>>So can't we come up with something to easily determine which device backs > >>>up this file? That will still not work for NFS backed storage > >>>though. > >> > >>Right. > >> > >>Additionally, in QEMU, we can rate limit based on concepts that make > >>sense to a guest. We can limit the actual I/O ops visible to the > >>guest which means that we'll get consistent performance regardless > >>of whether the backing file is qcow2, raw, LVM, or raw over NFS.
> >> > > >Are you referring to merging taking place which can change the definition > >of IOPS as seen by guest? > > No, with qcow2, it may take multiple real IOPs for what the guest > sees as an IOP. > > That's really the main argument I'm making here. The only entity > that knows what a guest IOP corresponds to is QEMU. On the backend, > it may end up being a network request, multiple BIOs to physical > disks, file access, etc. Couldn't QEMU give a hint to the kernel about the ratio of guest IOPs to real IOPs? Or is QEMU blind to the real IOPs that correspond to a guest IOP? If QEMU is truly blind to the amount of real IOPs, then couldn't QEMU-driven throttling cause physical resources to be oversubscribed (underestimating the backend work it is creating)? Mike ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 20:48 ` [Qemu-devel] " Mike Snitzer (?) @ 2011-05-31 22:22 ` Anthony Liguori -1 siblings, 0 replies; 56+ messages in thread From: Anthony Liguori @ 2011-05-31 22:22 UTC (permalink / raw) To: Mike Snitzer Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, zhanx, zhaoyang, llim, Ryan A Harper, Vivek Goyal On 05/31/2011 03:48 PM, Mike Snitzer wrote: > On Tue, May 31 2011 at 2:39pm -0400, > Anthony Liguori<anthony@codemonkey.ws> wrote: > >>> Are you referring to merging taking place which can change the definition >>> of IOPS as seen by guest? >> >> No, with qcow2, it may take multiple real IOPs for what the guest >> sees as an IOP. >> >> That's really the main argument I'm making here. The only entity >> that knows what a guest IOP corresponds to is QEMU. On the backend, >> it may end up being a network request, multiple BIOs to physical >> disks, file access, etc. > > Couldn't QEMU give a hint to the kernel about the ratio of guest IOP to > real IOPs? Or is QEMU blind to the real IOPs that correspond to a guest > IOP? Perhaps, but how does that work when the disk image is backed by NFS? And even if you had a VFS-level API, we can do things like libcurl-based block devices in QEMU. So unless you tried to do level-5 traffic throttling, which hopefully you'll agree is total overkill, we're going to need to have this functionality in QEMU no matter what. Regards, Anthony Liguori ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-31 13:45 ` [Qemu-devel] [RFC]QEMU " Vivek Goyal @ 2011-05-31 13:56 ` Daniel P. Berrange -1 siblings, 0 replies; 56+ messages in thread From: Daniel P. Berrange @ 2011-05-31 13:56 UTC (permalink / raw) To: Vivek Goyal Cc: Zhi Yong Wu, qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim On Tue, May 31, 2011 at 09:45:37AM -0400, Vivek Goyal wrote: > On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > Hello, all, > > > > I have prepared to work on a feature called "Disk I/O limits" for the qemu-kvm project. > > This feature will enable the user to cap the disk I/O amount performed by a VM. It is important for some storage resources to be shared among multiple VMs. As you know, if some of the VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > > > Hi Zhiyong, > > Why not use kernel blkio controller for this and why reinvent the wheel > and implement the feature again in qemu? The finest level of granularity offered by cgroups applies limits per QEMU process. So the blkio controller can't be used to apply controls directly to individual disks used by QEMU, only the VM as a whole. For networking we can use the 'net_cls' cgroups controller for the process as a whole, or attach 'tc' to individual TAP devices for per-NIC throttling, both of which ultimately use the same kernel functionality. I don't see an equivalent option for throttling individual disks that would reuse functionality from the blkio controller. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-31 13:56 ` [Qemu-devel] [RFC]QEMU " Daniel P. Berrange @ 2011-05-31 14:10 ` Vivek Goyal -1 siblings, 0 replies; 56+ messages in thread From: Vivek Goyal @ 2011-05-31 14:10 UTC (permalink / raw) To: Daniel P. Berrange Cc: Zhi Yong Wu, qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim On Tue, May 31, 2011 at 02:56:46PM +0100, Daniel P. Berrange wrote: > On Tue, May 31, 2011 at 09:45:37AM -0400, Vivek Goyal wrote: > > On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > > Hello, all, > > > > > > I have prepared to work on a feature called "Disk I/O limits" for the qemu-kvm project. > > > This feature will enable the user to cap the disk I/O amount performed by a VM. It is important for some storage resources to be shared among multiple VMs. As you know, if some of the VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > > > > > > Hi Zhiyong, > > > > Why not use kernel blkio controller for this and why reinvent the wheel > > and implement the feature again in qemu? > > The finest level of granularity offered by cgroups applies limits per QEMU > process. So the blkio controller can't be used to apply controls directly > to individual disks used by QEMU, only the VM as a whole. So are multiple VMs using the same disk? Then put those VMs in the same cgroup and apply the limit on that disk. Or if you want to put a system-wide limit on a disk, then put all VMs in the root cgroup and put the limit on the root cgroup. I fail to understand what the exact requirement here is. I thought the biggest use case was isolating one VM from another when they might be sharing the same device. Hence we were interested in putting a per-VM limit on a disk and not a system-wide limit on a disk (independent of VM). Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-31 14:10 ` [Qemu-devel] [RFC]QEMU " Vivek Goyal @ 2011-05-31 14:19 ` Daniel P. Berrange -1 siblings, 0 replies; 56+ messages in thread From: Daniel P. Berrange @ 2011-05-31 14:19 UTC (permalink / raw) To: Vivek Goyal Cc: Zhi Yong Wu, qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim On Tue, May 31, 2011 at 10:10:37AM -0400, Vivek Goyal wrote: > On Tue, May 31, 2011 at 02:56:46PM +0100, Daniel P. Berrange wrote: > > On Tue, May 31, 2011 at 09:45:37AM -0400, Vivek Goyal wrote: > > > On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > > > Hello, all, > > > > > > > > I have prepared to work on a feature called "Disk I/O limits" for qemu-kvm projeect. > > > > This feature will enable the user to cap disk I/O amount performed by a VM.It is important for some storage resources to be shared among multi-VMs. As you've known, if some of VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > > > > > > > > > Hi Zhiyong, > > > > > > Why not use kernel blkio controller for this and why reinvent the wheel > > > and implement the feature again in qemu? > > > > The finest level of granularity offered by cgroups apply limits per QEMU > > process. So the blkio controller can't be used to apply controls directly > > to individual disks used by QEMU, only the VM as a whole. > > So are multiple VMs using same disk. Then put multiple VMs in same > cgroup and apply the limit on that disk. > > Or if you want to put a system wide limit on a disk, then put all > VMs in root cgroup and put limit on root cgroups. > > I fail to understand what's the exact requirement here. I thought > the biggest use case was isolation one VM from other which might > be sharing same device. Hence we were interested in putting > per VM limit on disk and not a system wide limit on disk (independent > of VM). 
No, it isn't about putting limits on a disk independent of a VM. It is about one VM having multiple disks, and wanting to set different policies for each of its virtual disks. eg qemu-kvm -drive file=/dev/sda1 -drive file=/dev/sdb3 and wanting to say that sda1 is limited to 10 MB/s, while sdb3 is limited to 50 MB/s. You can't do that kind of thing with cgroups, because it can only control the entire process, not individual resources within the process. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ^ permalink raw reply [flat|nested] 56+ messages in thread
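Daniel's per-disk example maps onto the -drive extensions proposed at the top of the thread. A hypothetical invocation is sketched below; the option names (iops=, throughput=) were explicitly not final at the time of this RFC, and the byte values are assumptions chosen to match the 10 MB/s and 50 MB/s figures above:

```shell
# Hypothetical syntax based on the RFC's proposed -drive extensions
# (-drive [iops=xxx,][throughput=xxx]); option names were not final.
# Cap sda1 at ~10 MB/s and sdb3 at ~50 MB/s, per the example above.
qemu-kvm \
    -drive file=/dev/sda1,throughput=10485760 \
    -drive file=/dev/sdb3,throughput=52428800
```

Because the limit is attached to each -drive rather than to the QEMU process, the two disks get independent policies, which is exactly what a per-process cgroup cannot express.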
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-31 14:19 ` [Qemu-devel] [RFC]QEMU " Daniel P. Berrange @ 2011-05-31 14:28 ` Vivek Goyal -1 siblings, 0 replies; 56+ messages in thread From: Vivek Goyal @ 2011-05-31 14:28 UTC (permalink / raw) To: Daniel P. Berrange Cc: Zhi Yong Wu, qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim On Tue, May 31, 2011 at 03:19:56PM +0100, Daniel P. Berrange wrote: > On Tue, May 31, 2011 at 10:10:37AM -0400, Vivek Goyal wrote: > > On Tue, May 31, 2011 at 02:56:46PM +0100, Daniel P. Berrange wrote: > > > On Tue, May 31, 2011 at 09:45:37AM -0400, Vivek Goyal wrote: > > > > On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > > > > Hello, all, > > > > > > > > > > I have prepared to work on a feature called "Disk I/O limits" for qemu-kvm projeect. > > > > > This feature will enable the user to cap disk I/O amount performed by a VM.It is important for some storage resources to be shared among multi-VMs. As you've known, if some of VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > > > > > > > > > > > > Hi Zhiyong, > > > > > > > > Why not use kernel blkio controller for this and why reinvent the wheel > > > > and implement the feature again in qemu? > > > > > > The finest level of granularity offered by cgroups apply limits per QEMU > > > process. So the blkio controller can't be used to apply controls directly > > > to individual disks used by QEMU, only the VM as a whole. > > > > So are multiple VMs using same disk. Then put multiple VMs in same > > cgroup and apply the limit on that disk. > > > > Or if you want to put a system wide limit on a disk, then put all > > VMs in root cgroup and put limit on root cgroups. > > > > I fail to understand what's the exact requirement here. I thought > > the biggest use case was isolation one VM from other which might > > be sharing same device. 
Hence we were interested in putting > > per VM limit on disk and not a system wide limit on disk (independent > > of VM). > > No, it isn't about putting limits on a disk independent of a VM. It is > about one VM having multiple disks, and wanting to set different policies > for each of its virtual disks. eg > > qemu-kvm -drive file=/dev/sda1 -drive file=/dev/sdb3 > > and wanting to say that sda1 is limited to 10 MB/s, while sdb3 is > limited to 50 MB/s. You can't do that kind of thing with cgroups, > because it can only control the entire process, not individual > resources within the process. With the IO controller you can do that. Limits are "per cgroup per disk". So once you have put a VM in a cgroup, you can specify two different limits for the two disks for that cgroup. There are 4 relevant files per cgroup. blkio.throttle.read_bps_device blkio.throttle.write_bps_device blkio.throttle.read_iops_device blkio.throttle.write_iops_device And the syntax of these files is: device_major:device_minor <rate_limit> Ex. 8:16 1024000 This means from a specified cgroup, on the disk with major:minor 8:16, don't allow read BW higher than 1024000 bytes per second. Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
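The blkio.throttle interface Vivek describes can be driven directly from a shell. A sketch follows, assuming a cgroup v1 blkio hierarchy mounted at /sys/fs/cgroup/blkio, a backing disk with major:minor 8:16, and a QEMU PID in $QEMU_PID (all of these are system-specific assumptions):

```shell
# Sketch of the cgroup v1 blkio.throttle interface described above.
# Assumes blkio is mounted at /sys/fs/cgroup/blkio, the backing disk
# is major:minor 8:16, and $QEMU_PID is the VM's QEMU process.

# Create a cgroup for the VM and move the QEMU process into it.
mkdir /sys/fs/cgroup/blkio/vm1
echo "$QEMU_PID" > /sys/fs/cgroup/blkio/vm1/tasks

# Cap reads from this cgroup on disk 8:16 at 1024000 bytes/sec
# (Vivek's example line: "8:16 1024000").
echo "8:16 1024000" > /sys/fs/cgroup/blkio/vm1/blkio.throttle.read_bps_device

# A different limit can be set for a second disk from the same cgroup,
# e.g. cap writes on major:minor 8:32 at 200 iops.
echo "8:32 200" > /sys/fs/cgroup/blkio/vm1/blkio.throttle.write_iops_device
```

Note that this still throttles the whole process's traffic to each block device, which is why it works per physical disk but cannot distinguish two image files backed by the same device.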
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-31 14:19 ` [Qemu-devel] [RFC]QEMU " Daniel P. Berrange @ 2011-05-31 15:28 ` Ryan Harper -1 siblings, 0 replies; 56+ messages in thread From: Ryan Harper @ 2011-05-31 15:28 UTC (permalink / raw) To: Daniel P. Berrange Cc: Vivek Goyal, Zhi Yong Wu, qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, Anthony Liguori, Ryan A Harper, luowenj, wuzhy, zhanx, zhaoyang, llim * Daniel P. Berrange <berrange@redhat.com> [2011-05-31 09:25]: > On Tue, May 31, 2011 at 10:10:37AM -0400, Vivek Goyal wrote: > > On Tue, May 31, 2011 at 02:56:46PM +0100, Daniel P. Berrange wrote: > > > On Tue, May 31, 2011 at 09:45:37AM -0400, Vivek Goyal wrote: > > > > On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > > > > Hello, all, > > > > > > > > > > I have prepared to work on a feature called "Disk I/O limits" for qemu-kvm projeect. > > > > > This feature will enable the user to cap disk I/O amount performed by a VM.It is important for some storage resources to be shared among multi-VMs. As you've known, if some of VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > > > > > > > > > > > > Hi Zhiyong, > > > > > > > > Why not use kernel blkio controller for this and why reinvent the wheel > > > > and implement the feature again in qemu? > > > > > > The finest level of granularity offered by cgroups apply limits per QEMU > > > process. So the blkio controller can't be used to apply controls directly > > > to individual disks used by QEMU, only the VM as a whole. > > > > So are multiple VMs using same disk. Then put multiple VMs in same > > cgroup and apply the limit on that disk. > > > > Or if you want to put a system wide limit on a disk, then put all > > VMs in root cgroup and put limit on root cgroups. > > > > I fail to understand what's the exact requirement here. I thought > > the biggest use case was isolation one VM from other which might > > be sharing same device. 
Hence we were interested in putting > > per VM limit on disk and not a system wide limit on disk (independent > > of VM). > > No, it isn't about putting limits on a disk independant of a VM. It is > about one VM having multiple disks, and wanting to set different policies > for each of its virtual disks. eg > > qemu-kvm -drive file=/dev/sda1 -drive file=/dev/sdb3 > > and wanting to say that sda1 is limited to 10 MB/s, while sdb3 is > limited to 50 MB/s. You can't do that kind of thing with cgroups, > because it can only control the entire process, not individual > resources within the process. yes, but with files: qemu-kvm -drive file=/path/to/local/vm/images -drive file=/path/to/shared/storage -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx ryanh@us.ibm.com ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel][RFC]QEMU disk I/O limits 2011-05-30 5:09 ` [Qemu-devel] [RFC]QEMU " Zhi Yong Wu @ 2011-05-31 19:55 ` Vivek Goyal -1 siblings, 0 replies; 56+ messages in thread From: Vivek Goyal @ 2011-05-31 19:55 UTC (permalink / raw) To: Zhi Yong Wu Cc: qemu-devel, kvm, kwolf, guijianfeng, herbert, stefanha, aliguori, raharper, luowenj, wuzhy, zhanx, zhaoyang, llim On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: [..] > 3.) How the users enable and play with it > QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. What does the throughput interface look like? Is it bytes per second or something else? Do we have read and write variants for throughput, as we have for iops? If you have a bytes interface (as the kernel does), then "bps_rd" and "bps_wr" might be good names for the throughput interface too. Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 19:55 ` [Qemu-devel] [RFC]QEMU " Vivek Goyal @ 2011-06-01 3:12 ` Zhi Yong Wu -1 siblings, 0 replies; 56+ messages in thread From: Zhi Yong Wu @ 2011-06-01 3:12 UTC (permalink / raw) To: Vivek Goyal Cc: vgoyal, kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, ejt, wuzhy, luowenj, zhanx, zhaoyang, llim, raharper On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: >Date: Tue, 31 May 2011 15:55:49 -0400 >From: Vivek Goyal <vgoyal@redhat.com> >To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, > kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, > qemu-devel@nongnu.org, wuzhy@cn.ibm.com, > herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, > zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >User-Agent: Mutt/1.5.21 (2010-09-15) > >On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > >[..] >> 3.) How the users enable and play with it >> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. > >How does throughput interface look like? is it bytes per second or something >else? HI, Vivek, It will be a value based on bytes per second. > >Do we have read and write variants for throughput as we have for iops. QEMU code has two variants "rd_bytes, wr_bytes", but we maybe need to get their bytes per second. > >if you have bytes interface(as kenrel does), then "bps_rd" and "bps_wr" >might be good names too for thoughput interface. I agree with you, and can change them as your suggestions. Regards, Zhiyong Wu > >Thanks >Vivek > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 3:12 ` Zhi Yong Wu @ 2011-06-02 9:33 ` Michal Suchanek -1 siblings, 0 replies; 56+ messages in thread From: Michal Suchanek @ 2011-06-02 9:33 UTC (permalink / raw) To: Zhi Yong Wu Cc: Vivek Goyal, kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, ejt, luowenj, zhanx, zhaoyang, llim, raharper On 1 June 2011 05:12, Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> wrote: > On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: >>Date: Tue, 31 May 2011 15:55:49 -0400 >>From: Vivek Goyal <vgoyal@redhat.com> >>To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, >> kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, >> qemu-devel@nongnu.org, wuzhy@cn.ibm.com, >> herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, >> zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com >>Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >>User-Agent: Mutt/1.5.21 (2010-09-15) >> >>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: >> >>[..] >>> 3.) How the users enable and play with it >>> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. >> >>How does throughput interface look like? is it bytes per second or something >>else? > HI, Vivek, > It will be a value based on bytes per second. > >> >>Do we have read and write variants for throughput as we have for iops. > QEMU code has two variants "rd_bytes, wr_bytes", but we maybe need to get their bytes per second. > >> >>if you have bytes interface(as kenrel does), then "bps_rd" and "bps_wr" >>might be good names too for thoughput interface. > I agree with you, and can change them as your suggestions. 
> Changing them this way is not going to be an improvement. While rd_bytes and wr_bytes lack a time-interval specification, bps_rd and bps_wr are ambiguous: is that bits or bytes? Sure, there could be some distinction by capitalization, but that does not apply since qemu arguments are all lowercase. Thanks Michal ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-02 9:33 ` Michal Suchanek @ 2011-06-03 6:56 ` Zhi Yong Wu -1 siblings, 0 replies; 56+ messages in thread From: Zhi Yong Wu @ 2011-06-03 6:56 UTC (permalink / raw) To: Michal Suchanek Cc: Vivek Goyal, kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, ejt, luowenj, raharper On Thu, Jun 2, 2011 at 5:33 PM, Michal Suchanek <hramrach@centrum.cz> wrote: > On 1 June 2011 05:12, Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> wrote: >> On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: >>>Date: Tue, 31 May 2011 15:55:49 -0400 >>>From: Vivek Goyal <vgoyal@redhat.com> >>>To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >>>Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, >>> kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, >>> qemu-devel@nongnu.org, wuzhy@cn.ibm.com, >>> herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, >>> zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com >>>Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >>>User-Agent: Mutt/1.5.21 (2010-09-15) >>> >>>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: >>> >>>[..] >>>> 3.) How the users enable and play with it >>>> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. >>> >>>How does throughput interface look like? is it bytes per second or something >>>else? >> HI, Vivek, >> It will be a value based on bytes per second. >> >>> >>>Do we have read and write variants for throughput as we have for iops. >> QEMU code has two variants "rd_bytes, wr_bytes", but we maybe need to get their bytes per second. >> >>> >>>if you have bytes interface(as kenrel does), then "bps_rd" and "bps_wr" >>>might be good names too for thoughput interface. 
>> I agree with you, and can change them as your suggestions. >> > > Changing them this way is not going to be an improvement. While > rd_bytes and wr_bytes lack the time interval specification bps_rd and Right, rd_bytes and wr_bytes lack that. > bps_wr is ambiguous. Is that bits? bytes? Sure, there should be some If we implement them, they will be bytes. > distinction by capitalization but that does not apply since qemu > arguments are all lowercase. Michal, maybe you misunderstood what I meant. The two variables rd_bytes and wr_bytes exist in the block.c file and are not qemu arguments; bps_rd and bps_wr, however, will be added as qemu arguments. Regards, Zhiyong Wu > > Thanks > > Michal > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- Regards, Zhi Yong Wu ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-31 19:55 ` [Qemu-devel] [RFC]QEMU " Vivek Goyal @ 2011-06-01 3:19 ` Zhi Yong Wu -1 siblings, 0 replies; 56+ messages in thread From: Zhi Yong Wu @ 2011-06-01 3:19 UTC (permalink / raw) To: Vivek Goyal Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, ejt, wuzhy, luowenj, zhanx, zhaoyang, llim, raharper, vgoyal On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: >Date: Tue, 31 May 2011 15:55:49 -0400 >From: Vivek Goyal <vgoyal@redhat.com> >To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, > kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, > qemu-devel@nongnu.org, wuzhy@cn.ibm.com, > herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, > zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >User-Agent: Mutt/1.5.21 (2010-09-15) > >On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > >[..] >> 3.) How the users enable and play with it >> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. > >How does throughput interface look like? is it bytes per second or something >else? Given your suggestion, its form will look like below: -drive [iops=xxx][,bps=xxx] or -drive [iops_rd=xxx][,iops_wr=xxx][,bps_rd=xxx][,bps_wr=xxx] Regards, Zhiyong Wu > >Do we have read and write variants for throughput as we have for iops. > >if you have bytes interface(as kenrel does), then "bps_rd" and "bps_wr" >might be good names too for thoughput interface. > >Thanks >Vivek > ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 3:19 ` Zhi Yong Wu (?) @ 2011-06-01 13:32 ` Vivek Goyal 2011-06-02 6:07 ` Zhi Yong Wu -1 siblings, 1 reply; 56+ messages in thread From: Vivek Goyal @ 2011-06-01 13:32 UTC (permalink / raw) To: Zhi Yong Wu Cc: kwolf, stefanha, kvm, guijianfeng, qemu-devel, wuzhy, herbert, ejt, luowenj, zhanx, zhaoyang, llim, raharper On Wed, Jun 01, 2011 at 11:19:58AM +0800, Zhi Yong Wu wrote: > On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: > >Date: Tue, 31 May 2011 15:55:49 -0400 > >From: Vivek Goyal <vgoyal@redhat.com> > >To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> > >Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, > > kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, > > qemu-devel@nongnu.org, wuzhy@cn.ibm.com, > > herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, > > zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com > >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits > >User-Agent: Mutt/1.5.21 (2010-09-15) > > > >On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: > > > >[..] > >> 3.) How the users enable and play with it > >> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. > > > >How does throughput interface look like? is it bytes per second or something > >else? > Given your suggestion, its form will look like below: > > -drive [iops=xxx][,bps=xxx] or -drive [iops_rd=xxx][,iops_wr=xxx][,bps_rd=xxx][,bps_wr=xxx] Can one specify both iops and bps rule for the same drive? Thanks Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-06-01 13:32 ` Vivek Goyal @ 2011-06-02 6:07 ` Zhi Yong Wu 0 siblings, 0 replies; 56+ messages in thread From: Zhi Yong Wu @ 2011-06-02 6:07 UTC (permalink / raw) To: Vivek Goyal Cc: kwolf, stefanha, Mike Snitzer, guijianfeng, qemu-devel, wuzhy, herbert, Joe Thornber, Zhi Yong Wu, luowenj, kvm, zhanx, zhaoyang, llim, Ryan A Harper On Wed, Jun 01, 2011 at 09:32:32AM -0400, Vivek Goyal wrote: >Date: Wed, 1 Jun 2011 09:32:32 -0400 >From: Vivek Goyal <vgoyal@redhat.com> >To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >Cc: kwolf@redhat.com, stefanha@linux.vnet.ibm.com, kvm@vger.kernel.org, > guijianfeng@cn.fujitsu.com, qemu-devel@nongnu.org, wuzhy@cn.ibm.com, > herbert@gondor.hengli.com.au, ejt@redhat.com, luowenj@cn.ibm.com, > zhanx@cn.ibm.com, zhaoyang@cn.ibm.com, llim@redhat.com, > raharper@us.ibm.com >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >User-Agent: Mutt/1.5.21 (2010-09-15) > >On Wed, Jun 01, 2011 at 11:19:58AM +0800, Zhi Yong Wu wrote: >> On Tue, May 31, 2011 at 03:55:49PM -0400, Vivek Goyal wrote: >> >Date: Tue, 31 May 2011 15:55:49 -0400 >> >From: Vivek Goyal <vgoyal@redhat.com> >> >To: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> >> >Cc: kwolf@redhat.com, aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, >> > kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com, >> > qemu-devel@nongnu.org, wuzhy@cn.ibm.com, >> > herbert@gondor.hengli.com.au, luowenj@cn.ibm.com, zhanx@cn.ibm.com, >> > zhaoyang@cn.ibm.com, llim@redhat.com, raharper@us.ibm.com >> >Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits >> >User-Agent: Mutt/1.5.21 (2010-09-15) >> > >> >On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote: >> > >> >[..] >> >> 3.) How the users enable and play with it >> >> QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. 
When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. >> > >> >How does throughput interface look like? is it bytes per second or something >> >else? >> Given your suggestion, its form will look like below: >> >> -drive [iops=xxx][,bps=xxx] or -drive [iops_rd=xxx][,iops_wr=xxx][,bps_rd=xxx][,bps_wr=xxx] > >Can one specify both iops and bps rule for the same drive? Right. When both are specified, they will jointly limit the runtime I/O rate. Regards, Zhiyong Wu > >Thanks >Vivek ^ permalink raw reply [flat|nested] 56+ messages in thread
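As a concrete illustration of the option syntax agreed in this exchange, a small helper can assemble such a -drive argument. The helper `drive_arg` and the example values are hypothetical, and the option names are the ones proposed in the thread (the final QEMU syntax may differ):

```python
def drive_arg(path, iops=None, iops_rd=None, iops_wr=None,
              bps=None, bps_rd=None, bps_wr=None):
    """Build a -drive option string with the proposed I/O limit keys.

    iops/bps are combined limits; the _rd/_wr variants split reads and writes.
    Only keys the caller supplies are emitted, matching the optional syntax
    -drive [iops=xxx][,bps=xxx].
    """
    opts = [f"file={path}"]
    for key, val in (("iops", iops), ("iops_rd", iops_rd), ("iops_wr", iops_wr),
                     ("bps", bps), ("bps_rd", bps_rd), ("bps_wr", bps_wr)):
        if val is not None:
            opts.append(f"{key}={val}")
    return "-drive " + ",".join(opts)

# A drive capped at 100 IOPS and 10 MiB/s total:
print(drive_arg("/dev/sda1", iops=100, bps=10 * 1024 * 1024))
```

Per the answer above, a request is serviced only while it stays within both the iops and the bps rule when the two are given together.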
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits 2011-05-30 5:09 ` [Qemu-devel] [RFC]QEMU " Zhi Yong Wu @ 2011-06-02 6:17 ` Sasha Levin -1 siblings, 0 replies; 56+ messages in thread From: Sasha Levin @ 2011-06-02 6:17 UTC (permalink / raw) To: Zhi Yong Wu Cc: qemu-devel, kvm, kwolf, aliguori, herbert, guijianfeng, wuzhy, luowenj, zhanx, zhaoyang, llim, raharper, vgoyal, stefanha Hi, On Mon, 2011-05-30 at 13:09 +0800, Zhi Yong Wu wrote: > Hello, all, > > I have prepared to work on a feature called "Disk I/O limits" for the qemu-kvm project. > This feature will enable the user to cap the disk I/O amount performed by a VM. It is important when some storage resource is shared among multiple VMs. As you know, if some VMs are doing excessive disk I/O, they will hurt the performance of other VMs. > > More detail is available here: > http://wiki.qemu.org/Features/DiskIOLimits > > 1.) Why we need per-drive disk I/O limits > As you know, on Linux the cgroup blkio-controller supports I/O throttling on block devices. However, there is no single mechanism for disk I/O throttling across all underlying storage types (image file, LVM, NFS, Ceph), and for some types there is no way to throttle at all. > > The disk I/O limits feature introduces QEMU block layer I/O limits together with command-line and QMP interfaces for configuring limits. This allows I/O limits to be imposed across all underlying storage types using a single interface. > > 2.) How disk I/O limits will be implemented > The QEMU block layer will introduce a per-drive disk I/O request queue for those disks whose "disk I/O limits" feature is enabled. It can control disk I/O limits individually for each disk when multiple disks are attached to a VM, enabling use cases like unlimited local disk access but shared storage access with limits.
> In mutliple I/O threads scenario, when an application in a VM issues a block I/O request, this request will be intercepted by QEMU block layer, then it will calculate disk runtime I/O rate and determine if it has go beyond its limits. If yes, this I/O request will enqueue to that introduced queue; otherwise it will be serviced. > > 3.) How the users enable and play with it > QEMU -drive option will be extended so that disk I/O limits can be specified on its command line, such as -drive [iops=xxx,][throughput=xxx] or -drive [iops_rd=xxx,][iops_wr=xxx,][throughput=xxx] etc. When this argument is specified, it means that "disk I/O limits" feature is enabled for this drive disk. > The feature will also provide users with the ability to change per-drive disk I/O limits at runtime using QMP commands. I'm wondering if you've considered adding a 'burst' parameter - something which will not limit (or limit less) the io ops or the throughput for the first 'x' ms in a given time window. > Regards, > > Zhiyong Wu > -- Sasha. ^ permalink raw reply [flat|nested] 56+ messages in thread
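The interception flow described in section 2 - compute the drive's current rate, then either service the request or park it on the per-drive queue - can be sketched as a simplified single-threaded model. All names and the slice-based accounting are illustrative assumptions, not QEMU's actual block layer:

```python
import collections
import time

class DriveThrottle:
    """Illustrative per-drive throttle queue; not QEMU's implementation."""
    def __init__(self, bps_limit, slice_s=1.0):
        self.bps_limit = bps_limit          # bytes per second allowed
        self.slice_s = slice_s              # accounting window in seconds
        self.slice_start = time.monotonic()
        self.bytes_in_slice = 0
        self.queue = collections.deque()    # requests waiting for budget

    def _has_budget(self, nbytes):
        return self.bytes_in_slice + nbytes <= self.bps_limit * self.slice_s

    def submit(self, request, nbytes, service):
        """Intercept a request: service it now or enqueue it."""
        now = time.monotonic()
        if now - self.slice_start >= self.slice_s:
            # New accounting slice: reset the budget, drain waiters first.
            self.slice_start = now
            self.bytes_in_slice = 0
            while self.queue and self._has_budget(self.queue[0][1]):
                queued_req, queued_bytes, queued_service = self.queue.popleft()
                self.bytes_in_slice += queued_bytes
                queued_service(queued_req)
        if not self.queue and self._has_budget(nbytes):
            self.bytes_in_slice += nbytes
            service(request)        # under the limit: service immediately
        else:
            self.queue.append((request, nbytes, service))  # over: park it
```

Queued requests are drained in FIFO order before new ones, so a throttled drive preserves request ordering, which matters for guests that assume in-order completion per queue.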
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits
  2011-06-02 6:17 ` Sasha Levin
@ 2011-06-02 6:29 ` Zhi Yong Wu
  2011-06-02 7:15   ` Sasha Levin
  0 siblings, 1 reply; 56+ messages in thread
From: Zhi Yong Wu @ 2011-06-02 6:29 UTC (permalink / raw)
To: Sasha Levin
Cc: kwolf, aliguori, herbert, kvm, guijianfeng, qemu-devel, wuzhy, luowenj, zhanx, zhaoyang, llim, raharper, vgoyal, stefanha

On Thu, Jun 02, 2011 at 09:17:06AM +0300, Sasha Levin wrote:
>Hi,
>
>On Mon, 2011-05-30 at 13:09 +0800, Zhi Yong Wu wrote:
>> [original RFC text snipped]
>
>I'm wondering if you've considered adding a 'burst' parameter -
>something which will not limit (or limit less) the I/O ops or the
>throughput for the first 'x' ms in a given time window.
Currently no. Could you let us know in what scenario it would make sense?

Regards,

Zhiyong Wu

>
>> Regards,
>>
>> Zhiyong Wu
>
>--
>
>Sasha.

^ permalink raw reply [flat|nested] 56+ messages in thread
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits
  2011-06-02 6:29 ` Zhi Yong Wu
@ 2011-06-02 7:15 ` Sasha Levin
  2011-06-02 8:18   ` Zhi Yong Wu
  0 siblings, 1 reply; 56+ messages in thread
From: Sasha Levin @ 2011-06-02 7:15 UTC (permalink / raw)
To: Zhi Yong Wu
Cc: kwolf, aliguori, herbert, kvm, guijianfeng, qemu-devel, wuzhy, luowenj, zhanx, zhaoyang, llim, raharper, vgoyal, stefanha

On Thu, 2011-06-02 at 14:29 +0800, Zhi Yong Wu wrote:
> On Thu, Jun 02, 2011 at 09:17:06AM +0300, Sasha Levin wrote:
> >[original RFC text snipped]
> >
> >I'm wondering if you've considered adding a 'burst' parameter -
> >something which will not limit (or limit less) the io ops or the
> >throughput for the first 'x' ms in a given time window.
> Currently no, Do you let us know what scenario it will make sense to?

My assumption is that most guests are not doing constant disk I/O access. Instead, the operations are usually short and happen on a small scale (a relatively small number of bytes accessed).

For example: multiple-table DB lookups, serving a website, file servers.

Basically, if I need to do a DB lookup which needs 50MB of data from a disk which is limited to 10MB/s, I'd rather let it burst for 1 second and complete the lookup faster instead of having it read data for 5 seconds.

If the guest then starts running multiple lookups one after the other, that's when I would like to limit it.

--

Sasha.

^ permalink raw reply [flat|nested] 56+ messages in thread
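Sasha's 'burst' idea corresponds to a classic token bucket: tokens accumulate at the steady rate while the drive is idle, up to a bank (the burst allowance), so a one-off large read can proceed at full speed while sustained load is held to the configured rate. A minimal sketch under those assumed semantics; this is not QEMU code:

```python
import time

class TokenBucket:
    """Token bucket with a burst allowance; illustrative only."""
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps            # steady refill rate, bytes/s
        self.capacity = burst_bytes     # max tokens an idle drive can bank
        self.tokens = burst_bytes       # start full: first accesses may burst
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_consume(self, nbytes):
        """True if the request may proceed now; False if it must wait."""
        self._refill()
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

# A drive limited to 10 MB/s steady, with a 50 MB burst bank:
tb = TokenBucket(rate_bps=10 * 1024**2, burst_bytes=50 * 1024**2)
assert tb.try_consume(50 * 1024**2)      # the one-off DB lookup bursts through
assert not tb.try_consume(50 * 1024**2)  # a back-to-back lookup is throttled
```

This gives exactly the behavior Sasha describes: the isolated 50MB lookup finishes fast, while repeated lookups drain the bank and fall back to the 10MB/s steady rate.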
* Re: [Qemu-devel] [RFC]QEMU disk I/O limits
  2011-06-02 7:15 ` Sasha Levin
@ 2011-06-02 8:18 ` Zhi Yong Wu
  0 siblings, 0 replies; 56+ messages in thread
From: Zhi Yong Wu @ 2011-06-02 8:18 UTC (permalink / raw)
To: Sasha Levin
Cc: kwolf, aliguori, kvm, guijianfeng, qemu-devel, wuzhy, luowenj, stefanha

On Thu, Jun 02, 2011 at 10:15:02AM +0300, Sasha Levin wrote:
>On Thu, 2011-06-02 at 14:29 +0800, Zhi Yong Wu wrote:
>> [original RFC text and earlier replies snipped]
>> >
>> >I'm wondering if you've considered adding a 'burst' parameter -
>> >something which will not limit (or limit less) the io ops or the
>> >throughput for the first 'x' ms in a given time window.
>> Currently no, Do you let us know what scenario it will make sense to?
>
>My assumption is that most guests are not doing constant disk I/O
>access. Instead, the operations are usually short and happen on small
>scale (relatively small amount of bytes accessed).
>
>For example: Multiple table DB lookup, serving a website, file servers.
>
>Basically, if I need to do a DB lookup which needs 50MB of data from a
>disk which is limited to 10MB/s, I'd rather let it burst for 1 second
>and complete the lookup faster instead of having it read data for 5
>seconds.
>
>If the guest now starts running multiple lookups one after the other,
>thats when I would like to limit.

Hi, Sasha. If the iops or bps parameters are not specified for -drive, the disk's I/O rate will not be limited. Moreover, QMP commands will be extended to support changing or disabling disk I/O limits at runtime; if you'd like a disk's I/O rate to be unlimited, you can use them to disable the feature. I'm not sure whether this answers your question.

Regards,

Zhiyong Wu

>
>--
>
>Sasha.

^ permalink raw reply [flat|nested] 56+ messages in thread
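The runtime-tunable limits Zhiyong mentions eventually landed in QEMU as the block_set_io_throttle QMP command, where setting every limit to 0 removes throttling; at the time of this RFC that command did not yet exist, so the sketch below targets the merged command's shape, and the socket path and device name are hypothetical:

```python
import json
import socket

def build_throttle_cmd(device, iops=0, iops_rd=0, iops_wr=0,
                       bps=0, bps_rd=0, bps_wr=0):
    """Build a block_set_io_throttle command; all-zero limits disable throttling."""
    return {"execute": "block_set_io_throttle",
            "arguments": {"device": device,
                          "iops": iops, "iops_rd": iops_rd, "iops_wr": iops_wr,
                          "bps": bps, "bps_rd": bps_rd, "bps_wr": bps_wr}}

def send_qmp(sock_path, cmd):
    """Speak the QMP handshake and send one command (needs a live QEMU
    started with e.g. -qmp unix:/tmp/qmp.sock,server,nowait)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        f = s.makefile("rw")
        f.readline()                                   # greeting banner
        f.write(json.dumps({"execute": "qmp_capabilities"}) + "\n")
        f.flush()
        f.readline()                                   # capabilities ack
        f.write(json.dumps(cmd) + "\n")
        f.flush()
        return json.loads(f.readline())

# Cap a hypothetical drive at 1 MB/s, then lift the limit again:
cap = build_throttle_cmd("virtio0", bps=1024 * 1024)
uncap = build_throttle_cmd("virtio0")   # all zeros: throttling disabled
```

So "disable at runtime" is simply the same command with every limit set to zero, which answers Sasha's unlimited-disk case without any extra switch.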
end of thread, other threads:[~2011-06-04 8:55 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-30  5:09 [Qemu-devel][RFC]QEMU disk I/O limits Zhi Yong Wu
2011-05-30  5:09 ` [Qemu-devel] [RFC]QEMU " Zhi Yong Wu
2011-05-31 13:45 ` [Qemu-devel][RFC]QEMU " Vivek Goyal
2011-05-31 13:45 ` [Qemu-devel] [RFC]QEMU " Vivek Goyal
2011-05-31 13:50   ` Anthony Liguori
2011-05-31 13:50   ` [Qemu-devel] " Anthony Liguori
2011-05-31 14:04     ` Vivek Goyal
2011-05-31 14:04     ` [Qemu-devel] " Vivek Goyal
2011-05-31 14:25       ` Anthony Liguori
2011-05-31 17:59         ` Vivek Goyal
2011-05-31 17:59         ` Vivek Goyal
2011-05-31 18:39           ` Anthony Liguori
2011-05-31 18:39           ` Anthony Liguori
2011-05-31 19:24             ` Vivek Goyal
2011-05-31 19:24             ` Vivek Goyal
2011-05-31 23:30               ` Anthony Liguori
2011-06-01 13:20                 ` Vivek Goyal
2011-06-01 21:15                   ` Stefan Hajnoczi
2011-06-01 21:15                   ` Stefan Hajnoczi
2011-06-01 21:42                     ` Vivek Goyal
2011-06-01 21:42                     ` Vivek Goyal
2011-06-01 22:28                       ` Stefan Hajnoczi
2011-06-01 22:28                       ` Stefan Hajnoczi
2011-06-04  8:54                         ` Blue Swirl
2011-06-04  8:54                         ` Blue Swirl
2011-05-31 20:48           ` Mike Snitzer
2011-05-31 20:48           ` [Qemu-devel] " Mike Snitzer
2011-05-31 22:22             ` Anthony Liguori
2011-05-31 13:56 ` [Qemu-devel][RFC]QEMU " Daniel P. Berrange
2011-05-31 13:56 ` [Qemu-devel] [RFC]QEMU " Daniel P. Berrange
2011-05-31 14:10   ` [Qemu-devel][RFC]QEMU " Vivek Goyal
2011-05-31 14:10   ` [Qemu-devel] [RFC]QEMU " Vivek Goyal
2011-05-31 14:19     ` [Qemu-devel][RFC]QEMU " Daniel P. Berrange
2011-05-31 14:19     ` [Qemu-devel] [RFC]QEMU " Daniel P. Berrange
2011-05-31 14:28       ` [Qemu-devel][RFC]QEMU " Vivek Goyal
2011-05-31 14:28       ` [Qemu-devel] [RFC]QEMU " Vivek Goyal
2011-05-31 15:28 ` [Qemu-devel][RFC]QEMU " Ryan Harper
2011-05-31 15:28 ` [Qemu-devel] [RFC]QEMU " Ryan Harper
2011-05-31 19:55   ` [Qemu-devel][RFC]QEMU " Vivek Goyal
2011-05-31 19:55   ` [Qemu-devel] [RFC]QEMU " Vivek Goyal
2011-06-01  3:12     ` Zhi Yong Wu
2011-06-01  3:12     ` Zhi Yong Wu
2011-06-02  9:33       ` Michal Suchanek
2011-06-02  9:33       ` Michal Suchanek
2011-06-03  6:56         ` Zhi Yong Wu
2011-06-03  6:56         ` Zhi Yong Wu
2011-06-01  3:19   ` Zhi Yong Wu
2011-06-01  3:19   ` Zhi Yong Wu
2011-06-01 13:32     ` Vivek Goyal
2011-06-02  6:07       ` Zhi Yong Wu
2011-06-02  6:17 ` Sasha Levin
2011-06-02  6:17 ` Sasha Levin
2011-06-02  6:29   ` Zhi Yong Wu
2011-06-02  7:15     ` Sasha Levin
2011-06-02  8:18       ` Zhi Yong Wu
2011-06-02  8:18       ` Zhi Yong Wu