From: "Vinod, Chegu"
Subject: RE: [Qemu-devel] KVM call agenda for Tuesday, June 19th
Date: Thu, 12 Jul 2012 02:02:24 +0100
To: "dlaor@redhat.com"
Cc: "kvm@vger.kernel.org"
In-Reply-To: <4FFD4E76.1020005@redhat.com>

-----Original Message-----
From: Dor Laor [mailto:dlaor@redhat.com]
Sent: Wednesday, July 11, 2012 2:59 AM
To: Vinod, Chegu
Cc: kvm@vger.kernel.org
Subject: Re: [Qemu-devel] KVM call agenda for Tuesday, June 19th

On 06/19/2012 06:42 PM, Chegu Vinod wrote:
> Hello,
>
> Wanted to share some preliminary data from live migration experiments
> on a setup that is perhaps one of the larger ones.
>
> We used Juan's "huge_memory" patches (without the separate migration
> thread) and measured the total migration time and the time taken for
> stage 3 ("downtime"). Note: we didn't change the default "downtime"
> (30ms?). We had a private 10Gig back-to-back link between the two
> hosts, and we set the migration speed to 10Gig.
>
> The "workloads" chosen were ones that we could easily set up. All
> experiments were done without using virsh/virt-manager (i.e. direct
> interaction with the qemu monitor prompt). Please see the data below.
>
> As the guest size increased (and for the busier workloads) we observed
> that network connections were getting dropped not only during the
> "downtime" (i.e. stage 3) but also at times during the iterative
> pre-copy phase (i.e. stage 2). Perhaps some of this will get fixed
> when we have the migration thread implemented.
>
> We had also briefly tried the proposed delta compression changes
> (easier to say than XBZRLE :)) on a smaller configuration. For the
> simple workloads (perhaps there was not much temporal locality in
> them) it didn't seem to show improvements; instead it took a much
> longer time to migrate (high cache miss penalty?). Waiting for the
> updated version of XBZRLE for further experiments to see how well it
> scales on this larger setup...
>
> FYI
> Vinod
>
> ---
> 10 VCPUs / 128G
> ---
> 1) Idle guest
> Total migration time : 124585 ms,
> Stage_3_time : 941 ms,
> Total MB transferred : 2720
>
> 2) AIM7-compute (2000 users)
> Total migration time : 123540 ms,
> Stage_3_time : 726 ms,
> Total MB transferred : 3580
>
> 3) SpecJBB (modified to run 10 warehouse threads for a long duration
> of time)
> Total migration time : 165720 ms,
> Stage_3_time : 6851 ms,
> Total MB transferred : 19656

6.8 s of downtime may be unacceptable for some applications. Does it
converge with a maximum downtime of 1 sec? In theory this is where
post-copy can shine.

But what we're missing in the (good) performance data is how the
application performs during live migration. This is exactly where the
live migration thread and dirty-bit optimization should help us.
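For reference, a monitor-driven run of the kind described above, and
the 1 s downtime cap suggested here, would be set up roughly as follows
at the HMP prompt. This is only a sketch: the destination host name,
port number, and the exact values are illustrative placeholders, not
taken from these experiments.

  # destination host: start QEMU listening for the incoming migration
  qemu-system-x86_64 ... -incoming tcp:0:4444

  # source host, at the monitor prompt:
  (qemu) migrate_set_speed 10g       # bytes/sec in HMP; a value at or above
                                     # link capacity effectively uncaps a
                                     # 10 Gbit/s link
  (qemu) migrate_set_downtime 0.03   # seconds; ~30 ms is the default,
                                     # e.g. 1 for a 1 s cap
  (qemu) migrate -d tcp:<dest-host>:4444
  (qemu) info migrate                # poll transferred/remaining RAM while
                                     # the stage-2 pre-copy iterates

With the XBZRLE series applied, the delta-compression cache would
additionally be sized with migrate_set_cache_size and the capability
turned on with "migrate_set_capability xbzrle on" (command names as in
the version later merged upstream).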
Our 'friends' have a nice old analysis of live migration performance:
- http://www.cl.cam.ac.uk/research/srg/netos/papers/2005-migration-nsdi-pre.pdf
- http://www.vmware.com/files/pdf/techpaper/VMW_Netioc_BestPractices.pdf

Cheers,
Dor

There have been some recent fixes (from Juan) that are supposed to
honor the user-requested downtime. I am in the middle of redoing some
of my experiments and will share the results when they are ready (in
about 3-4 days). Initial observations are that the total migration
time increases considerably, but there are no observed stalls or ping
timeouts etc. Will know more after I finish my experiments (i.e. the
non-XBZRLE ones). As expected, the 10G [back-to-back] connection is
not really getting saturated with the migration traffic... so there is
some other layer that is consuming time (possibly the overhead of
tracking dirty pages). I haven't yet had the time to try to quantify
the performance degradation on the workload during live migration
(stage 2)... need to look at that next.

Thanks for the pointers to the old articles.

Thanks
Vinod

> 4) Google SAT (-s 3600 -C 5 -i 5)
> Total migration time : 411827 ms,
> Stage_3_time : 77807 ms,
> Total MB transferred : 142136
>
> ---
> 20 VCPUs / 256G
> ---
>
> 1) Idle guest
> Total migration time : 259938 ms,
> Stage_3_time : 1998 ms,
> Total MB transferred : 5114
>
> 2) AIM7-compute (2000 users)
> Total migration time : 261336 ms,
> Stage_3_time : 2107 ms,
> Total MB transferred : 5473
>
> 3) SpecJBB (modified to run 20 warehouse threads for a long duration
> of time)
> Total migration time : 390548 ms,
> Stage_3_time : 19596 ms,
> Total MB transferred : 48109
>
> 4) Google SAT (-s 3600 -C 10 -i 10)
> Total migration time : 780150 ms,
> Stage_3_time : 90346 ms,
> Total MB transferred : 251287
>
> ---
> 30 VCPUs / 384G
> ---
>
> 1) Idle guest
> (qemu) Total migration time : 501704 ms,
> Stage_3_time : 2835 ms,
> Total MB transferred : 15731
>
> 2) AIM7-compute (2000 users)
> Total migration time : 496001 ms,
> Stage_3_time : 3884 ms,
> Total MB transferred : 9375
>
> 3) SpecJBB (modified to run 30 warehouse threads for a long duration
> of time)
> Total migration time : 611075 ms,
> Stage_3_time : 17107 ms,
> Total MB transferred : 48862
>
> 4) Google SAT (-s 3600 -C 15 -i 15) (look at /tmp/kvm_30w_Goog)
> Total migration time : 1348102 ms,
> Stage_3_time : 128531 ms,
> Total MB transferred : 367524
>
> ---
> 40 VCPUs / 512G
> ---
>
> 1) Idle guest
> Total migration time : 780257 ms,
> Stage_3_time : 3770 ms,
> Total MB transferred : 13330
>
> 2) AIM7-compute (2000 users)
> Total migration time : 720963 ms,
> Stage_3_time : 3966 ms,
> Total MB transferred : 10595
>
> 3) SpecJBB (modified to run 40 warehouse threads for a long duration
> of time)
> Total migration time : 863577 ms,
> Stage_3_time : 25149 ms,
> Total MB transferred : 54685
>
> 4) Google SAT (-s 3600 -C 20 -i 20)
> Total migration time : 2585039 ms,
> Stage_3_time : 177625 ms,
> Total MB transferred : 493575
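A quick back-of-the-envelope check of the numbers above supports the
"link is not saturated" observation: the 40 VCPU / 512G idle-guest run
moved 13330 MB in 780257 ms, about 17 MB/s (~0.14 Gbit/s), and even the
largest run (Google SAT at 40 VCPUs) averaged roughly
493575 MB / 2585 s, i.e. about 190 MB/s (~1.5 Gbit/s), well below what
a 10 Gbit/s back-to-back link can carry. Most of the elapsed time is
therefore being spent somewhere other than on the wire, consistent with
the dirty-page-tracking overhead suspected above.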