From mboxrd@z Thu Jan  1 00:00:00 1970
From: Sasha Levin <levinsasha928@gmail.com>
Subject: Re: APIC lookups
Date: Sat, 03 Sep 2011 10:42:20 +0300
Message-ID: <1315035740.31676.36.camel@lappy>
References: <1314986155.31676.22.camel@lappy>
	 <20110902181323.GJ26451@redhat.com> <1314990522.31676.30.camel@lappy>
	 <20110903073208.GK26451@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: kvm <kvm@vger.kernel.org>
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mail-fx0-f46.google.com ([209.85.161.46]:44487 "EHLO
	mail-fx0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751188Ab1ICHmb (ORCPT <rfc822;kvm@vger.kernel.org>);
	Sat, 3 Sep 2011 03:42:31 -0400
Received: by fxh19 with SMTP id 19so2301049fxh.19
        for <kvm@vger.kernel.org>; Sat, 03 Sep 2011 00:42:29 -0700 (PDT)
In-Reply-To: <20110903073208.GK26451@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Sat, 2011-09-03 at 10:32 +0300, Gleb Natapov wrote:
> On Fri, Sep 02, 2011 at 10:08:42PM +0300, Sasha Levin wrote:
> > On Fri, 2011-09-02 at 21:13 +0300, Gleb Natapov wrote:
> > > On Fri, Sep 02, 2011 at 08:55:55PM +0300, Sasha Levin wrote:
> > > > Hi,
> > > > 
> > > > I've noticed that kvm_irq_delivery_to_apic() is locating the destination
> > > > APIC by running through kvm_for_each_vcpu() which becomes a scalability
> > > > issue with a large number of vcpus.
> > > > 
> > > > I'm thinking about speeding that up using a radix tree for lookups, and
> > > > was wondering if it sounds right.
> > > > 
> > > We have to call kvm_apic_match_dest() on each apic to see if it should
> > > get the message. A single message can be sent to more than one apic. It
> > > is likely possible to optimize the common case of physical addressing
> > > with a fixed destination, but then just use an array of 256 elements;
> > > no need for a tree.
> > 
> > I think it's possible to handle logical addressing as well: instead of
> > a simple compare we just need to go through all the IDs that would
> > 'and' with the dest.
> > 
> There are two kinds of logical addressing: flat and cluster. And
> I see nothing that prevents different CPUs from being in different modes.
> 

Hm... I thought that when using logical addressing it's either flat or
cluster, not both.

In that case - yes, let's skip that.
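
For the physical, fixed-destination case, here is a rough sketch of what
the 256-element array could look like. This is standalone C, not kernel
code: struct sketch_vcpu, struct sketch_apic_map and the sketch_* helpers
are names I made up for illustration, and I'm assuming 8-bit xAPIC IDs
with 0xff treated as broadcast. Anything the table can't answer falls
back to the existing scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_ID_COUNT  256    /* assumption: 8-bit physical APIC IDs */
#define APIC_BROADCAST 0xff   /* assumption: 0xff means "all APICs" */

/* Hypothetical stand-in for a vcpu; not the kernel's struct kvm_vcpu. */
struct sketch_vcpu {
	int index;
};

/* Hypothetical per-VM table: physical APIC ID -> owning vcpu (or NULL). */
struct sketch_apic_map {
	struct sketch_vcpu *phys[APIC_ID_COUNT];
};

/* Record (or clear, with vcpu == NULL) the owner of a physical APIC ID. */
static void sketch_map_set(struct sketch_apic_map *map, uint8_t apic_id,
			   struct sketch_vcpu *vcpu)
{
	assert(map != NULL);
	map->phys[apic_id] = vcpu;
}

/*
 * Fast path for physical, fixed-destination delivery.
 * Returns the single target, or NULL when the caller has to fall back to
 * the full per-vcpu scan (broadcast, or no vcpu registered for that ID).
 */
static struct sketch_vcpu *
sketch_lookup_phys(const struct sketch_apic_map *map, uint8_t dest_id)
{
	assert(map != NULL);
	if (dest_id == APIC_BROADCAST)
		return NULL;
	return map->phys[dest_id];
}
```

The table would have to be updated whenever a guest changes its APIC ID,
and a real version needs whatever locking or RCU the surrounding code
uses; none of that is shown here.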

> It is better to cache the lookup result in the irq routing entry to
> speed up subsequent interrupts.
> 
> > > Do you see this function in profiling?
> > 
> > I was running profiling to see which functions get much slower during
> > regular operation (not boot) when you run with a large number of vcpus,
> > and this was one of them.
> > 
> > Though this is probably due to the method we use to find lowest priority
> > and not the lookups themselves.
> > 
> Currently we round robin between all cpus on each interrupt when lowest priority
> delivery is used. We should do it every N interrupts, where N >> 1.

I'll try that and see how it improves performance.
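
Roughly what I have in mind for the "rotate every N interrupts" part, as
a standalone C sketch. The names (sketch_lowprio_state,
sketch_pick_target, ROTATE_EVERY) and the value 64 are placeholders of
mine, not anything in the tree, and candidates are just numbered
0..nr_candidates-1 instead of being real vcpus.

```c
#include <assert.h>
#include <stdint.h>

#define ROTATE_EVERY 64u   /* hypothetical N; only needs to be >> 1 */

/* Hypothetical per-VM state for lowest-priority delivery. */
struct sketch_lowprio_state {
	uint32_t count;    /* interrupts sent to the current target so far */
	uint32_t current;  /* index of the candidate currently in use */
};

/*
 * Pick the candidate index for the next lowest-priority interrupt.
 * The same candidate is reused for ROTATE_EVERY interrupts in a row,
 * then we move on to the next one, wrapping around at nr_candidates.
 */
static uint32_t sketch_pick_target(struct sketch_lowprio_state *s,
				   uint32_t nr_candidates)
{
	uint32_t target;

	assert(nr_candidates > 0);

	/* The candidate set may have shrunk since the last call. */
	if (s->current >= nr_candidates)
		s->current = 0;

	target = s->current;

	/* Advance only after ROTATE_EVERY deliveries to this target. */
	if (++s->count >= ROTATE_EVERY) {
		s->count = 0;
		s->current = (s->current + 1) % nr_candidates;
	}

	return target;
}
```

With 4 candidates the first 64 calls return 0, the next 64 return 1, and
so on. There is no locking in the sketch, so concurrent callers would
need an atomic counter or similar.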

-- 

Sasha.