From: Vivek Goyal <vgoyal@redhat.com>
To: Anthony Liguori <anthony@codemonkey.ws>
Cc: kwolf@redhat.com, stefanha@linux.vnet.ibm.com,
	kvm@vger.kernel.org, guijianfeng@cn.fujitsu.com,
	Mike Snitzer <snitzer@redhat.com>,
	qemu-devel@nongnu.org, wuzhy@cn.ibm.com,
	herbert@gondor.hengli.com.au, Joe Thornber <ejt@redhat.com>,
	Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>,
	luowenj@cn.ibm.com, zhanx@cn.ibm.com, zhaoyang@cn.ibm.com,
	llim@redhat.com, Ryan A Harper <raharper@us.ibm.com>
Subject: Re: [Qemu-devel] [RFC]QEMU disk I/O limits
Date: Tue, 31 May 2011 15:24:34 -0400	[thread overview]
Message-ID: <20110531192434.GK16382@redhat.com> (raw)
In-Reply-To: <4DE535F3.6040400@codemonkey.ws>

On Tue, May 31, 2011 at 01:39:47PM -0500, Anthony Liguori wrote:
> On 05/31/2011 12:59 PM, Vivek Goyal wrote:
> >On Tue, May 31, 2011 at 09:25:31AM -0500, Anthony Liguori wrote:
> >>On 05/31/2011 09:04 AM, Vivek Goyal wrote:
> >>>On Tue, May 31, 2011 at 08:50:40AM -0500, Anthony Liguori wrote:
> >>>>On 05/31/2011 08:45 AM, Vivek Goyal wrote:
> >>>>>On Mon, May 30, 2011 at 01:09:23PM +0800, Zhi Yong Wu wrote:
> >>>>>>Hello, all,
> >>>>>>
> >>>>>>     I have prepared to work on a feature called "Disk I/O limits" for the qemu-kvm project.
> >>>>>>     This feature will enable the user to cap the amount of disk I/O performed by a VM. This is important when storage resources are shared among multiple VMs: as you know, if some VMs do excessive disk I/O, they hurt the performance of the other VMs.
> >>>>>>
> >>>>>
> >>>>>Hi Zhiyong,
> >>>>>
> >>>>>Why not use kernel blkio controller for this and why reinvent the wheel
> >>>>>and implement the feature again in qemu?
> >>>>
> >>>>blkio controller only works for block devices.  It doesn't work when
> >>>>using files.
> >>>
> >>>So can't we come up with something to easily determine which device
> >>>backs this file? Though that will still not work for NFS-backed
> >>>storage.
> >>
> >>Right.
> >>
> >>Additionally, in QEMU, we can rate limit based on concepts that make
> >>sense to a guest.  We can limit the actual I/O ops visible to the
> >>guest which means that we'll get consistent performance regardless
> >>of whether the backing file is qcow2, raw, LVM, or raw over NFS.
> >>
> >
> >Are you referring to merging taking place which can change the definition
> >of IOPS as seen by guest?
> 
> No, with qcow2, it may take multiple real IOPs for what the guest
> sees as an IOP.
> 
> That's really the main argument I'm making here.  The only entity
> that knows what a guest IOP corresponds to is QEMU.  On the backend,
> it may end up being a network request, multiple BIOs to physical
> disks, file access, etc.

Ok, so we seem to be talking about two requirements:

- A consistent experience for the guest
- Isolation between VMs

If this qcow2 mapping/metadata overhead is not significant, then we
don't have to worry about the IOPS perceived by the guest; they will be
more or less the same. If it is significant, then we provide a more
consistent experience to the guest, but we weaken the isolation between
guests and might overload the backend storage, and in turn might not get
the expected IOPS for the guest anyway.

So I think these two things are not independent.
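
To put the coupling in numbers, here is a back-of-the-envelope sketch (all figures are made up for illustration): if qcow2 metadata amplifies each guest I/O into multiple backend I/Os, per-guest IOPS caps alone no longer bound the load on the shared backend.

```python
# Hypothetical illustration of how metadata amplification couples the
# two requirements. Every number below is invented for the example.
GUEST_IOPS_LIMIT = 100   # per-VM cap as seen by the guest
AMPLIFICATION = 1.5      # avg backend I/Os per guest I/O (qcow2 metadata)
NUM_VMS = 8
BACKEND_CAPACITY = 1000  # IOPS the shared storage can sustain

# The per-guest caps look safe individually, but together they can
# exceed what the backend can deliver:
backend_load = GUEST_IOPS_LIMIT * AMPLIFICATION * NUM_VMS
print(backend_load)                      # 1200.0
print(backend_load <= BACKEND_CAPACITY)  # False
```

When the check fails, the backend is oversubscribed and no guest can count on its nominal limit, which is the isolation problem described above.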

I agree, though, that the advantage of qemu is that everything is a file,
and handling all the complex configurations becomes very easy.

Having said that, to provide a consistent experience to the guest, you
also need to know where the I/O from the guest is going and whether the
underlying storage system can support that kind of I/O.

I/O limits are of not much use if they are set in isolation, without
knowing where the I/O is going and how many VMs are doing I/O to it.
Otherwise there are no guarantees/estimates on minimum bandwidth for the
guests, and hence there is no consistent experience.
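
In other words, whoever hands out per-guest limits has to start from the backend's capacity. A hypothetical helper (the function name and all numbers are my own, for illustration only):

```python
def safe_per_vm_iops(backend_capacity, num_vms, amplification):
    """Largest per-guest IOPS cap such that, even with every VM at its
    cap and every guest I/O amplified on the backend, the combined load
    stays within what the shared storage can sustain."""
    return backend_capacity / (num_vms * amplification)

# With the same invented figures as before: a 1000-IOPS backend shared
# by 8 VMs at 1.5x amplification supports caps of only ~83 guest IOPS.
print(safe_per_vm_iops(1000, 8, 1.5))  # ~83.3
```

Without an input like backend_capacity, a limit is just a ceiling, not a guarantee of minimum bandwidth.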

Thanks
Vivek

