From: Marcelo Tosatti
Subject: Re: kvm scaling question
Date: Fri, 11 Sep 2009 18:53:55 -0300
Message-ID: <20090911215355.GD6244@amt.cnet>
To: Bruce Rogers
Cc: kvm@vger.kernel.org
In-Reply-To: <4AAA1A0A0200004800080E06@novprvlin0050.provo.novell.com>

On Fri, Sep 11, 2009 at 09:36:10AM -0600, Bruce Rogers wrote:
> I am wondering if anyone has investigated how well kvm scales when
> supporting many guests, many vcpus, or both.
>
> I'll do some investigation into the per-VM memory overhead and play
> with bumping the max vcpu limit way beyond 16, but hopefully someone
> can comment on issues such as known locking problems that need to be
> addressed to increase parallelism, general overhead percentages that
> can help set consolidation expectations, etc.

I suppose it depends on the guest and workload. With an EPT host and a
16-way Linux guest doing kernel compilations, on a recent kernel, I see:

# Samples: 98703304
#
# Overhead  Command  Shared Object  Symbol
# ........  .......  .............  ......
#
    97.15%       sh  [kernel]       [k] vmx_vcpu_run
     0.27%       sh  [kernel]       [k] kvm_arch_vcpu_ioctl_
     0.12%       sh  [kernel]       [k] default_send_IPI_mas
     0.09%       sh  [kernel]       [k] _spin_lock_irq

Which is pretty good. Without EPT/NPT the mmu_lock seems to be the
major bottleneck to parallelism.

> Also, when I did a simple experiment with vcpu overcommitment, I was
> surprised how quickly performance suffered (just bringing a Linux vm
> up), since I would have assumed the additional vcpus would have been
> halted the vast majority of the time. On a 2 proc box, overcommitting
> a guest to 8 vcpus (I know this isn't a good usage scenario, but it
> does provide some insight) caused the boot time to increase
> dramatically. At 16 vcpus, it took hours just to reach the gui login
> prompt.

One probable reason for that is that vcpus holding spinlocks in the
guest get scheduled out in favour of vcpus which spin on those same
locks.

> Any perspective you can offer would be appreciated.
>
> Bruce
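
As an illustrative aside (not part of the original exchange): the
lock-holder preemption effect described above can be sketched in a few
lines of userspace C. More spinning threads than CPUs stand in for
overcommitted vcpus; whenever the holder of the naive spinlock is
descheduled, the waiters burn their whole timeslices making no
progress. The thread count and loop size are arbitrary assumptions,
and none of this is KVM code.

    /*
     * Minimal sketch of lock-holder preemption: a naive userspace
     * spinlock where a waiter spins for as long as the holder is
     * descheduled.  In a guest, the "holder" is a vcpu the host
     * scheduler has preempted and the "waiter" is another vcpu
     * spinning on the same guest spinlock.
     *
     * Build with:  gcc -O2 -pthread lockholder.c
     */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_flag lock = ATOMIC_FLAG_INIT;
    static long protected_counter;

    static void spin_lock(void)
    {
            /* If the current holder is not running, this loop burns a
             * full scheduler timeslice doing no useful work. */
            while (atomic_flag_test_and_set_explicit(&lock,
                                                     memory_order_acquire))
                    ;
    }

    static void spin_unlock(void)
    {
            atomic_flag_clear_explicit(&lock, memory_order_release);
    }

    static void *worker(void *arg)
    {
            for (int i = 0; i < 100000; i++) {
                    spin_lock();
                    protected_counter++;
                    spin_unlock();
            }
            return NULL;
    }

    int main(void)
    {
            /* More threads than CPUs forces preemption of lock holders,
             * analogous to putting 8 or 16 vcpus on a 2-way host. */
            enum { NTHREADS = 8 };
            pthread_t t[NTHREADS];

            for (int i = 0; i < NTHREADS; i++)
                    pthread_create(&t[i], NULL, worker, NULL);
            for (int i = 0; i < NTHREADS; i++)
                    pthread_join(t[i], NULL);

            printf("counter = %ld\n", protected_counter);
            return 0;
    }

Running it with NTHREADS well above the number of host CPUs shows the
same qualitative slowdown: correctness is preserved, but wall-clock
time grows far faster than the extra work alone would explain.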