From: "Bruce Rogers"
To: "Marcelo Tosatti"
Subject: Re: kvm scaling question
Date: Mon, 14 Sep 2009 17:19:55 -0600
Message-ID: <4AAE7B3B0200004800081118@novprvlin0050.provo.novell.com>
In-Reply-To: <20090911215355.GD6244@amt.cnet>
References: <4AAA1A0A0200004800080E06@novprvlin0050.provo.novell.com> <20090911215355.GD6244@amt.cnet>

On 9/11/2009 at 3:53 PM, Marcelo Tosatti wrote:
> On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote:
>> I am wondering if anyone has investigated how well kvm scales when
>> supporting many guests, or many vcpus, or both.
>>
>> I'll do some investigation into the per-vm memory overhead and
>> play with bumping the max vcpu limit way beyond 16, but hopefully
>> someone can comment on issues such as locking problems that are known
>> to exist and need to be addressed to increase parallelism,
>> general overhead percentages which can help set consolidation
>> expectations, etc.
>
> I suppose it depends on the guest and workload. With an EPT host and a
> 16-way Linux guest doing kernel compilations on a recent kernel, I see:
>
> # Samples: 98703304
> #
> # Overhead  Command  Shared Object  Symbol
> # ........  .......  .............  ......
> #
>    97.15%   sh       [kernel]       [k] vmx_vcpu_run
>     0.27%   sh       [kernel]       [k] kvm_arch_vcpu_ioctl_
>     0.12%   sh       [kernel]       [k] default_send_IPI_mas
>     0.09%   sh       [kernel]       [k] _spin_lock_irq
>
> Which is pretty good. Without EPT/NPT the mmu_lock seems to be the major
> bottleneck to parallelism.
>
>> Also, when I did a simple experiment with vcpu overcommitment, I was
>> surprised how quickly performance suffered (just bringing a Linux vm
>> up), since I would have assumed the additional vcpus would have been
>> halted the vast majority of the time. On a 2-processor box,
>> overcommitting a single guest to 8 vcpus (I know this isn't a good
>> usage scenario, but it does provide some insight) caused boot time to
>> increase dramatically. At 16 vcpus, it took hours just to reach the
>> gui login prompt.
>
> One probable reason for that is that vcpus which hold spinlocks in the
> guest are scheduled out in favour of vcpus which spin on that same lock.

I suspected there might be a whole lot of spinning happening. That does
seem most likely. I was just surprised how bad the behavior was.

Bruce
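
[A rough illustration of the lock-holder preemption described above -- a
generic sketch, not code from either kernel or from this thread. A guest
spinlock boils down to a busy-wait loop like the one below, so when the host
deschedules the vCPU that currently owns the lock, every other vCPU that
reaches spin_lock() spins uselessly until its own timeslice expires, and a
lock normally held for microseconds is effectively held for a whole
scheduling quantum.]

    #include <stdatomic.h>
    #include <stdio.h>

    /* Generic test-and-set spinlock, simplified from what a guest kernel does. */
    static atomic_flag lock = ATOMIC_FLAG_INIT;

    static void spin_lock(void)
    {
            /*
             * Busy-wait until the holder clears the flag. This only makes
             * progress while the holding vCPU is actually running; if the
             * host has preempted that vCPU, the loop just burns the
             * spinner's entire timeslice.
             */
            while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
                    ; /* a real kernel would also issue a pause/rep-nop hint here */
    }

    static void spin_unlock(void)
    {
            atomic_flag_clear_explicit(&lock, memory_order_release);
    }

    int main(void)
    {
            spin_lock();
            printf("critical section\n");
            spin_unlock();
            return 0;
    }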