From: Anthony Liguori
In-Reply-To: <518FCF30.8040908@redhat.com>
References: <1368128600-30721-1-git-send-email-chegu_vinod@hp.com> <1368128600-30721-4-git-send-email-chegu_vinod@hp.com> <87y5bnc7a0.fsf@codemonkey.ws> <518D00B6.6040305@hp.com> <87ip2qzx7g.fsf@codemonkey.ws> <518FCF30.8040908@redhat.com>
Date: Mon, 13 May 2013 07:18:15 -0500
Message-ID: <87obcfnke0.fsf@codemonkey.ws>
Subject: Re: [Qemu-devel] [RFC PATCH v5 3/3] Force auto-convegence of live migration
To: Paolo Bonzini
Cc: owasserm@redhat.com, Chegu Vinod, qemu-devel@nongnu.org, quintela@redhat.com

Paolo Bonzini writes:

> On 10/05/2013 17:11, Anthony Liguori wrote:
>> Chegu Vinod writes:
>>
>>> On 5/10/2013 6:07 AM, Anthony Liguori wrote:
>>>> Chegu Vinod writes:
>>>>
>>>>> If a user chooses to turn on the auto-converge migration capability
>>>>> these changes detect the lack of convergence and throttle down the
>>>>> guest, i.e. force the VCPUs out of the guest for some duration
>>>>> and let the migration thread catch up and help converge.
>>>>>
>>>>> Verified the convergence using the following:
>>>>> - SpecJbb2005 workload running on a 20VCPU/256G guest (~80% busy)
>>>>> - OLTP like workload running on a 80VCPU/512G guest (~80% busy)
>>>>>
>>>>> Sample results with SpecJbb2005 workload: (migrate speed set to 20Gb and
>>>>> migrate downtime set to 4 seconds).
>>>> Would it make sense to separate out the "slow the VCPU down" part of
>>>> this?
>>>>
>>>> That would give a management tool more flexibility to create policies
>>>> around slowing the VCPU down to encourage migration.
>>>
>>> I believe one can always enhance libvirt tools to monitor the migration
>>> statistics and control the shares/entitlements of the vcpus via
>>> cgroups, thereby slowing the guest down to allow for convergence (I had
>>> that listed in my earlier versions of the patches as an option and also
>>> noted that it requires external (i.e. tool driven) monitoring and
>>> triggers, and that this alternative was kind of automatic after the
>>> initial setting of the capability).
>>>
>>> Is that what you meant by your comment above (or) are you talking about
>>> something outside the scope of cgroups and from an implementation point
>>> of view also outside the migration code path, i.e. a new knob that an
>>> external tool can use to just throttle down the vcpus of a guest?
>>
>> I'm saying, a knob to throttle the guest vcpus within QEMU that could be
>> used by management tools to encourage convergence.
>>
>> For instance, consider an imaginary "vcpu_throttle" command that took a
>> number between 0 and 1 that throttled VCPU performance accordingly.
>>
>> Then migration would look like:
>>
>> 0) throttle = 1.0
>> 1) call migrate command to start migration
>> 2) query progress until you decide you aren't converging
>> 3) throttle *= 0.75; call vcpu_throttle $throttle
>> 4) goto (2)
>>
>> Now I'm not opposed to a series like this that adds this sort of policy
>> to QEMU itself too, but I want to make sure the pieces are exposed for a
>> management tool to implement its own policies too.
>
> Note that QEMU can also throttle VCPUs as they dirty guest memory,
> rather than based on CPU time.  That's something that management
> cannot do (you can approximate it based on the recent history if you
> provide dirtying statistics, but it's not the same thing).

Sure, but in that case I'd argue you would want to expose that as a
command that libvirt could invoke too.

Regards,

Anthony Liguori

>
> Paolo
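
For illustration only, the management-side loop in steps 0) to 4) above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not an existing QEMU or libvirt interface: "vcpu_throttle" is the imaginary command from the thread, so every callback here (start_migration, is_converging, is_done, set_vcpu_throttle, wait) is a hypothetical hook the management tool would have to supply, and the min_throttle floor and max_polls limit are additions that the thread does not mention. FakeGuest is a toy model used only to exercise the loop.

```python
from typing import Callable, List


def migrate_with_throttle(
    start_migration: Callable[[], None],
    is_converging: Callable[[], bool],
    is_done: Callable[[], bool],
    set_vcpu_throttle: Callable[[float], None],
    wait: Callable[[], None],
    factor: float = 0.75,
    min_throttle: float = 0.05,   # assumption: never throttle all the way to zero
    max_polls: int = 1000,        # assumption: give up eventually
) -> float:
    """Run the policy loop and return the throttle in effect at completion."""
    throttle = 1.0                          # step 0
    start_migration()                       # step 1
    for _ in range(max_polls):
        wait()                              # let one poll interval elapse
        if is_done():
            return throttle
        if not is_converging():             # step 2
            throttle = max(throttle * factor, min_throttle)
            set_vcpu_throttle(throttle)     # step 3
        # step 4: go around again
    raise TimeoutError("migration did not finish within max_polls")


class FakeGuest:
    """Toy model: memory dirtied per poll scales linearly with the throttle."""

    def __init__(self, remaining: float = 1000.0, dirty_rate: float = 150.0,
                 bandwidth: float = 100.0) -> None:
        self.remaining = remaining          # data still to transfer
        self.last_remaining = remaining
        self.dirty_rate = dirty_rate        # dirtied per poll at throttle 1.0
        self.bandwidth = bandwidth          # transferred per poll
        self.throttle = 1.0
        self.started = False
        self.history: List[float] = []      # every throttle value applied

    def start_migration(self) -> None:
        self.started = True

    def set_vcpu_throttle(self, value: float) -> None:
        self.throttle = value
        self.history.append(value)

    def step(self) -> None:
        """Advance the model by one poll interval."""
        self.last_remaining = self.remaining
        dirtied = self.dirty_rate * self.throttle
        self.remaining = max(0.0, self.remaining + dirtied - self.bandwidth)

    def is_done(self) -> bool:
        return self.remaining == 0.0

    def is_converging(self) -> bool:
        return self.remaining < self.last_remaining
```

A tool would pass its own monitor-backed callbacks in place of the FakeGuest methods. Keeping the convergence test as a separate callback is deliberate: the policy (when to decide you aren't converging, how hard to throttle) stays in the management tool, which is the split argued for above.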