From mboxrd@z Thu Jan 1 00:00:00 1970
From: Avi Kivity
Subject: Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts
Date: Wed, 26 Jan 2011 11:22:37 +0200
Message-ID: <4D3FE7DD.7070603@redhat.com>
References: <20110121233040.22262.68117.stgit@s20.home> <20110124093241.GA28654@amt.cnet> <4D3D89B1.30300@siemens.com> <1295883899.3230.9.camel@x201> <1295933876.3230.46.camel@x201> <4D3EA485.8030806@redhat.com> <1295967460.3230.67.camel@x201> <4D3F0448.7030702@redhat.com> <1295977396.3230.80.camel@x201>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jan Kiszka , Marcelo Tosatti , "kvm@vger.kernel.org" , "ddutile@redhat.com" , "mst@redhat.com" , "chrisw@redhat.com"
To: Alex Williamson
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:26781 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751422Ab1AZJWo (ORCPT ); Wed, 26 Jan 2011 04:22:44 -0500
In-Reply-To: <1295977396.3230.80.camel@x201>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 01/25/2011 07:43 PM, Alex Williamson wrote:
> On Tue, 2011-01-25 at 19:11 +0200, Avi Kivity wrote:
> > On 01/25/2011 04:57 PM, Alex Williamson wrote:
> > > On Tue, 2011-01-25 at 12:23 +0200, Avi Kivity wrote:
> > > > On 01/25/2011 07:37 AM, Alex Williamson wrote:
> > > > > On Mon, 2011-01-24 at 08:44 -0700, Alex Williamson wrote:
> > > > > > I'll look at how we might be
> > > > > > able to allocate slots on demand.  Thanks,
> > > > >
> > > > > Here's a first cut just to see if this looks agreeable.  This allows the
> > > > > slot array to grow on demand.  This works with current userspace, as
> > > > > well as userspace trivially modified to double KVMState.slots and
> > > > > hotplugging enough pci-assign devices to exceed the previous limit (w/ &
> > > > > w/o ept).  Hopefully I got the rcu bits correct.  Does this look like
> > > > > the right path?
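To make the grow-on-demand idea concrete, here's a rough userspace sketch of the copy-and-publish pattern it implies. The struct and function names (kvm_memory_slot, grow_memslots) are simplified stand-ins, not the actual patch, and plain assignment plus free() stands in for rcu_assign_pointer() and synchronize_rcu():

```c
#include <stdlib.h>
#include <string.h>

/* Illustrative stand-in for a memory slot; fields are simplified. */
struct kvm_memory_slot {
	unsigned long base_gfn;
	unsigned long npages;
};

struct kvm_memslots {
	int nmemslots;                  /* allocated entries */
	struct kvm_memory_slot slots[]; /* flexible array, grows on demand */
};

static struct kvm_memslots *alloc_memslots(int n)
{
	struct kvm_memslots *s =
		calloc(1, sizeof(*s) + n * sizeof(struct kvm_memory_slot));
	s->nmemslots = n;
	return s;
}

/*
 * Grow by copy-and-publish: build a larger array, copy the old
 * contents, then swap the pointer.  In the kernel the swap would be
 * rcu_assign_pointer() and the old array could only be freed after
 * synchronize_rcu(); here plain assignment and free() stand in.
 */
static struct kvm_memslots *grow_memslots(struct kvm_memslots *old, int n)
{
	struct kvm_memslots *grown = alloc_memslots(n);

	memcpy(grown->slots, old->slots,
	       old->nmemslots * sizeof(struct kvm_memory_slot));
	free(old); /* kernel: synchronize_rcu() first */
	return grown;
}
```

The point of the pattern is that readers under rcu_read_lock() always see either the complete old array or the complete new one, never a half-resized array.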
> > > >
> > > > This can be trivially exhausted to pin all RAM.
> > >
> > > What's a reasonable upper limit?  A PCI device can have at most 6 MMIO
> > > BARs, each taking one slot.
> >
> > A BAR can take no slots, or several slots.  For example a BAR might have
> > a framebuffer, followed by an off-screen framebuffer, followed by an
> > mmio register area in one BAR.  You'd want the framebuffer to be dirty
> > logged while the offscreen framebuffer is not tracked (so one slot for
> > each) while the mmio range cannot be used as a slot.
> >
> > That only holds for emulated devices.
>
> Sure, emulated devices can do lots of specialty mappings, but I also
> expect that more typically, mmio access to emulated devices will get
> bounced through qemu and not use any slots.

Right; the example I gave (a framebuffer) is the exception rather than
the rule, and I believe modern framebuffers are usually accessed through
DMA rather than BARs.

> > > It might also support MSI-X in a way that
> > > splits a BAR, so an absolute max of 7 slots per PCI device.  Assuming
> > > only 1 PCI bus for the moment and also assuming assigned devices are
> > > typically single function, 32 devices * 7 slots/device = 224 slots, so
> > > maybe a 256 limit?  Maybe we even bump it up to a limit of 64 devices
> > > with a slot limit of 512.  It would be easier in device assignment code
> > > to keep track of a number-of-devices limit than trying to guess whether
> > > slots will be available when we need them.
> >
> > Sounds reasonable.  But we'd need a faster search for 512 slots.
>
> Ok, I'll add a limit.  I'm not convinced the typical use case is going
> to increase slots at all, so I'm still a little resistant to optimizing
> the search at this point.

I don't want there to be two kvm implementations out there, both
supporting 512 slots, one slow and one fast.  It means that you have no
idea what performance to expect.
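For reference, the kind of difference we're talking about: the per-fault linear walk vs keeping the array sorted by base gfn and binary-searching it. A rough sketch, with simplified structs and a made-up slot layout, not the actual kvm code:

```c
#include <stddef.h>

/* Simplified slot: a gfn range [base_gfn, base_gfn + npages). */
struct slot {
	unsigned long base_gfn;
	unsigned long npages;
};

/* Linear lookup, O(n) -- what a 512-entry array would pay per fault. */
static const struct slot *find_linear(const struct slot *s, int n,
				      unsigned long gfn)
{
	for (int i = 0; i < n; i++)
		if (gfn >= s[i].base_gfn && gfn < s[i].base_gfn + s[i].npages)
			return &s[i];
	return NULL;
}

/*
 * Binary search, O(log n), over an array kept sorted by base_gfn with
 * non-overlapping ranges (an invariant the insertion path would have
 * to maintain).
 */
static const struct slot *find_binary(const struct slot *s, int n,
				      unsigned long gfn)
{
	int lo = 0, hi = n;

	while (lo < hi) {
		int mid = (lo + hi) / 2;

		if (gfn < s[mid].base_gfn)
			hi = mid;
		else if (gfn >= s[mid].base_gfn + s[mid].npages)
			lo = mid + 1;
		else
			return &s[mid];
	}
	return NULL;
}
```

Either way the lookup cost is predictable, which is the real point above: two implementations both advertising 512 slots shouldn't differ wildly in what a fault costs.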
> BTW, simply by the ordering of when the
> platform registers memory vs when other devices are mapped, we seem to
> end up placing the actual memory regions at the front of the list.  Not
> sure if that's by design or luck.  Thanks,

It was done by design, though as qemu evolves, it becomes more and more
a matter of luck that the order doesn't change.

-- 
error compiling committee.c: too many arguments to function