From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Williamson
Subject: Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts
Date: Tue, 25 Jan 2011 07:41:32 -0700
Message-ID: <1295966492.3230.55.camel@x201>
References: <20110121233040.22262.68117.stgit@s20.home>
 <20110124093241.GA28654@amt.cnet>
 <4D3D89B1.30300@siemens.com>
 <1295883899.3230.9.camel@x201>
 <1295933876.3230.46.camel@x201>
 <4D3E7D74.1030100@web.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Marcelo Tosatti, "kvm@vger.kernel.org", "ddutile@redhat.com",
 "mst@redhat.com", "avi@redhat.com", "chrisw@redhat.com"
To: Jan Kiszka
In-Reply-To: <4D3E7D74.1030100@web.de>

On Tue, 2011-01-25 at 08:36 +0100, Jan Kiszka wrote:
> On 2011-01-25 06:37, Alex Williamson wrote:
> > On Mon, 2011-01-24 at 08:44 -0700, Alex Williamson wrote:
> >> I'll look at how we might be able to allocate slots on demand.
> >> Thanks,
> >
> > Here's a first cut just to see if this looks agreeable.  This allows
> > the slot array to grow on demand.  This works with current userspace,
> > as well as userspace trivially modified to double KVMState.slots and
> > hotplugging enough pci-assign devices to exceed the previous limit
> > (w/ & w/o ept).  Hopefully I got the rcu bits correct.  Does this
> > look like the right path?  If so, I'll work on removing the fixed
> > limit from userspace next.  Thanks,
> >
> > Alex
> >
> >
> > kvm: Allow memory slot array to grow on demand
> >
> > Remove fixed KVM_MEMORY_SLOTS limit, allowing the slot array
> > to grow on demand.  Private slots are now allocated at the
> > front instead of the end.
> > Only x86 seems to use private slots,
>
> Hmm, doesn't current user space expect slots 8..11 to be the private
> ones and wouldn't it cause troubles if slots 0..3 are suddenly reserved?

The private slots aren't currently visible to userspace; they're
actually slots 32..35.  The patch automatically increments user-passed
slot ids, so userspace has its own zero-based view of the array.
Frankly, I don't understand why userspace reserves slots 8..11; is this
for compatibility with older kernel implementations?

> > so this is now zero for all other archs.  The memslots pointer
> > is already updated using rcu, so changing the size of the
> > array when it's replaced is straightforward.  x86 also keeps
> > a bitmap of slots used by a kvm_mmu_page, which requires a
> > shadow tlb flush whenever we increase the number of slots.
> > This forces the pages to be rebuilt with the new bitmap size.
>
> Is it possible for user space to increase the slot number to ridiculous
> amounts (at least as far as kmalloc allows) and then trigger a kernel
> walk through them in non-preemptible contexts?  Just wondering, I
> haven't checked all contexts of functions like kvm_is_visible_gfn yet.
>
> If yes, we should already switch to rbtree or something like that.
> Otherwise that may wait a bit, but probably not too long.

Yeah, Avi has brought up the hole that userspace can exploit this
interface with these changes.  However, for 99+% of users, this change
leaves the slot array at about the same size, or makes it smaller.  Only
huge, scale-out guests would probably even see a doubling of slots (my
guest with 14 82576 VFs uses 48 slots).  On the kernel side, I think we
can safely defer a tree implementation as a later optimization should we
determine it's necessary.  We'll have to see how the userspace side
shakes out to figure out what's best there.  Thanks,

Alex
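[To make the scheme concrete, here is a hypothetical userspace C model of
the grow-on-demand array with front-allocated private slots.  The names
(PRIVATE_SLOTS, ensure_slot, user_to_internal) are illustrative, not from
the patch, and a plain pointer swap stands in for the kernel's
rcu_assign_pointer()/synchronize_rcu() publish step.]

```c
#include <stdlib.h>
#include <string.h>

#define PRIVATE_SLOTS 4  /* x86 reserves 4 private slots; 0 on other archs */

struct slot_array {
    size_t nslots;  /* total slots: private + user */
    int *slots;     /* stand-in for struct kvm_memory_slot entries */
};

/* Map a zero-based userspace slot id to its internal index; this is the
 * "automatically increment user-passed slot ids" step. */
size_t user_to_internal(size_t user_id)
{
    return user_id + PRIVATE_SLOTS;
}

struct slot_array *slot_array_new(void)
{
    struct slot_array *sa = malloc(sizeof(*sa));
    sa->nslots = PRIVATE_SLOTS;  /* start with only the private slots */
    sa->slots = calloc(sa->nslots, sizeof(int));
    return sa;
}

/* Grow the array (copy-and-replace) so internal index 'id' fits.  The
 * kernel version would allocate the larger copy, publish it with
 * rcu_assign_pointer(), wait with synchronize_rcu() before freeing the
 * old array, and on x86 flush shadow pages so each kvm_mmu_page slot
 * bitmap is rebuilt at the new size. */
struct slot_array *ensure_slot(struct slot_array *old, size_t id)
{
    if (id < old->nslots)
        return old;

    struct slot_array *new = malloc(sizeof(*new));
    new->nslots = id + 1;
    new->slots = calloc(new->nslots, sizeof(int));
    memcpy(new->slots, old->slots, old->nslots * sizeof(int));

    free(old->slots);  /* kernel: only after synchronize_rcu() */
    free(old);
    return new;
}
```

With this model, a typical guest never grows past a handful of slots,
while the 14-VF guest mentioned above would grow to 48 user slots on
demand instead of paying for a large fixed array up front.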