From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zhi Yong Wu
Subject: Re: [PATCH v2 1/1] The codes V2 for QEMU disk I/O limits.
Date: Fri, 29 Jul 2011 10:09:30 +0800
To: Marcelo Tosatti
Cc: Zhi Yong Wu, qemu-devel@nongnu.org, kvm@vger.kernel.org,
 aliguori@us.ibm.com, stefanha@linux.vnet.ibm.com, ryanh@us.ibm.com,
 kwolf@redhat.com, vgoyal@redhat.com
In-Reply-To: <20110728144211.GA17754@amt.cnet>
References: <1311670746-20498-1-git-send-email-wuzhy@linux.vnet.ibm.com>
 <1311670746-20498-2-git-send-email-wuzhy@linux.vnet.ibm.com>
 <20110726192618.GA8126@amt.cnet>
 <20110727154913.GA6334@amt.cnet>
 <20110728144211.GA17754@amt.cnet>

On Thu, Jul 28, 2011 at 10:42 PM, Marcelo Tosatti wrote:
> On Thu, Jul 28, 2011 at 12:24:48PM +0800, Zhi Yong Wu wrote:
>> On Wed, Jul 27, 2011 at 11:49 PM, Marcelo Tosatti wrote:
>> > On Wed, Jul 27, 2011 at 06:17:15PM +0800, Zhi Yong Wu wrote:
>> >> >> +        wait_time = 1;
>> >> >> +    }
>> >> >> +
>> >> >> +    wait_time = wait_time + (slice_time - elapsed_time);
>> >> >> +    if (wait) {
>> >> >> +        *wait = wait_time * BLOCK_IO_SLICE_TIME * 10 + 1;
>> >> >> +    }
>> >> >
>> >> > The guest can keep submitting requests where "wait_time = 1" above,
>> >> > and the timer will be rearmed continuously in the future.
>> >
>> > This is wrong.
>> >
>> >>  Can't you
>> >> > simply arm the timer to the next slice start? _Some_ data must be
>> >> > transfered by then, anyway (and nothing can be transfered earlier than
>> >> > that).
>> >
>> > This is valid.
>> >
>> >> Sorry, i have got what you mean. Can you elaborate in more detail?
>> >
>> > Sorry, the bug i mentioned about timer being rearmed does not exist.
>> >
>> > But arming the timer for the last request as its done is confusing/unnecessary.
>> >
>> > The timer processes queued requests, but the timer is armed accordingly
>> > to the last queued request in the slice. For example, if a request is
>> > submitted 1ms before the slice ends, the timer is armed 100ms +
>> > (slice_time - elapsed_time) in the future.
>>
>> If the timer is simply amred to the next slice start, this timer will
>> be a periodic timer, either the I/O rate can not be throttled under
>> the limits, or the enqueued request can be delayed to handled, this
>> will lower I/O rate seriously than the limits.
>
> Yes, periodic but disarmed when there is no queueing. I don't understand
> your point about low I/O rate.
We want the runtime I/O rate to stay under the limits while remaining
very close to them. If the timer is simply armed for the next slice
start, an enqueued request could wait a long time before it is handled,
which could push the actual I/O rate far below the limits. That could
seriously hurt I/O performance.
>
>> Maybe the slice time should be variable with current I/O rate. What do
>> you think of it?
>
> Not sure if its necessary. The slice should be small to avoid excessive
> work on timer context, but the advantage of increasing the slice is
> very small if any. BTW, 10ms seems a better default than 100ms.
Thanks for your comments. If you're interested, please take a look at
the V3 code, which adds support for a variable slice time.
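To make the two arming policies discussed above concrete, here is a
minimal sketch. It is not the patch code: the function names, the
100ms slice, and the millisecond units are all assumptions chosen to
match the "1ms before the slice ends" example quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed slice length in milliseconds (the 100ms from the example). */
#define SLICE_MS 100

/*
 * Delay when the timer is armed relative to the last queued request,
 * as the quoted example describes it: one full extra slice on top of
 * the time remaining in the current slice.
 * Hypothetical helper, not a function from the patch.
 */
static int64_t wait_last_request_ms(int64_t elapsed_ms)
{
    assert(elapsed_ms >= 0 && elapsed_ms <= SLICE_MS);
    return SLICE_MS + (SLICE_MS - elapsed_ms);
}

/*
 * Delay when the timer is simply armed for the start of the next
 * slice: only the time remaining in the current slice.
 * Hypothetical helper, not a function from the patch.
 */
static int64_t wait_next_slice_ms(int64_t elapsed_ms)
{
    assert(elapsed_ms >= 0 && elapsed_ms <= SLICE_MS);
    return SLICE_MS - elapsed_ms;
}
```

Under these assumptions, a request queued 99ms into the slice gives
wait_last_request_ms(99) == 101 and wait_next_slice_ms(99) == 1. The
sketch only shows how far apart the two deadlines are; it does not
model how many requests each policy lets through per slice.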
>
>

-- 
Regards,

Zhi Yong Wu