From: Avi Kivity
To: Alex Williamson
Cc: Jan Kiszka, Marcelo Tosatti, kvm@vger.kernel.org, ddutile@redhat.com, mst@redhat.com, chrisw@redhat.com
Subject: Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts
Date: Tue, 25 Jan 2011 16:53:44 +0200
Message-ID: <4D3EE3F8.3020603@redhat.com>
In-Reply-To: <1295966492.3230.55.camel@x201>
References: <20110121233040.22262.68117.stgit@s20.home> <20110124093241.GA28654@amt.cnet> <4D3D89B1.30300@siemens.com> <1295883899.3230.9.camel@x201> <1295933876.3230.46.camel@x201> <4D3E7D74.1030100@web.de> <1295966492.3230.55.camel@x201>

On 01/25/2011 04:41 PM, Alex Williamson wrote:
> > >  kvm: Allow memory slot array to grow on demand
> > >
> > >  Remove fixed KVM_MEMORY_SLOTS limit, allowing the slot array
> > >  to grow on demand.  Private slots are now allocated at the
> > >  front instead of the end.  Only x86 seems to use private slots,
> >
> > Hmm, doesn't current user space expect slots 8..11 to be the private
> > ones and wouldn't it cause trouble if slots 0..3 are suddenly reserved?
>
> The private slots aren't currently visible to userspace, they're
> actually slots 32..35.  The patch automatically increments user-passed
> slot ids so userspace has its own zero-based view of the array.
> Frankly, I don't understand why userspace reserves slots 8..11, is this
> compatibility with older kernel implementations?

I think so.  I believe those kernel versions are too old now to matter,
but of course I can't be sure.

> > >  so this is now zero for all other archs.  The memslots pointer
> > >  is already updated using rcu, so changing the size of the
> > >  array when it's replaced is straightforward.  x86 also keeps
> > >  a bitmap of slots used by a kvm_mmu_page, which requires a
> > >  shadow tlb flush whenever we increase the number of slots.
> > >  This forces the pages to be rebuilt with the new bitmap size.
> >
> > Is it possible for user space to increase the slot number to ridiculous
> > amounts (at least as far as kmalloc allows) and then trigger a kernel
> > walk through them in non-preemptible contexts?  Just wondering, I haven't
> > checked all contexts of functions like kvm_is_visible_gfn yet.
> >
> > If yes, we should already switch to an rbtree or something like that.
> > Otherwise that may wait a bit, but probably not too long.
>
> Yeah, Avi has brought up the hole that userspace can exploit this
> interface with these changes.  However, for 99+% of users, this change
> leaves the slot array at about the same size, or makes it smaller.  Only
> huge, scale-out guests would probably even see a doubling of slots (my
> guest with 14 82576 VFs uses 48 slots).  On the kernel side, I think we
> can safely leave a tree implementation as a later optimization should we
> determine it's necessary.  We'll have to see how the userspace side
> shapes up to figure out what's best there.  Thanks,

A tree would probably be a pessimization until we are able to cache the
result of lookups.
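To make that concrete, the lookup in question is essentially the
following linear scan (a simplified userspace sketch; the structure and
field names here are illustrative stand-ins for the kvm memslot
structures, not the actual kernel code):

#include <stdio.h>
#include <stddef.h>

typedef unsigned long long gfn_t;

/* Illustrative stand-ins for the kernel's memslot structures. */
struct memslot {
	gfn_t base_gfn;
	unsigned long npages;
};

struct memslots {
	size_t nslots;
	struct memslot slots[8];
};

/*
 * One pass over a small contiguous array: the branch pattern is
 * regular and the whole array fits in a few cachelines.
 */
static struct memslot *find_slot(struct memslots *ms, gfn_t gfn)
{
	for (size_t i = 0; i < ms->nslots; i++) {
		struct memslot *s = &ms->slots[i];

		if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
			return s;
	}
	return NULL;	/* with TDP, the miss is the common case */
}

int main(void)
{
	struct memslots ms = {
		.nslots = 2,
		.slots = { { 0x0, 0xa0 }, { 0x100, 0xf00 } },
	};

	printf("gfn 0x123 -> %s\n", find_slot(&ms, 0x123) ? "hit" : "miss");
	return 0;
}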
The linear scan generates a very simple pattern of branch predictions
and memory accesses, while a tree touches a whole bunch of cachelines
and generates unpredictable branches (if the inputs are unpredictable).

Note that with TDP most lookups result in failure, so all we need is a
fast way to determine whether to perform the lookup at all.  That can be
done by caching the last lookup for this address in the spte by setting
a reserved bit.  For the other lookups, which we believe will succeed,
we can assume the probability of a match is related to the slot size,
and sort the slots by page count.
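The sorting half of that, sketched as a userspace toy (the slot
contents are made up; the point is only the descending-by-npages
comparator):

#include <stdio.h>
#include <stdlib.h>

struct memslot {
	unsigned long long base_gfn;
	unsigned long npages;
};

/*
 * If the probability that a successful lookup lands in a given slot is
 * proportional to that slot's size, scanning the largest slots first
 * minimizes the expected number of comparisons.
 */
static int cmp_npages_desc(const void *a, const void *b)
{
	const struct memslot *x = a, *y = b;

	if (x->npages != y->npages)
		return y->npages > x->npages ? 1 : -1;
	return 0;
}

int main(void)
{
	struct memslot slots[] = {
		{ 0x00000,   16 },	/* small, rarely-hit slot */
		{ 0x10000, 4096 },	/* large ram slot */
		{ 0x20000,  256 },
	};
	size_t n = sizeof(slots) / sizeof(slots[0]);

	qsort(slots, n, sizeof(slots[0]), cmp_npages_desc);

	for (size_t i = 0; i < n; i++)
		printf("slot %zu: base_gfn=%#llx npages=%lu\n",
		       i, slots[i].base_gfn, slots[i].npages);
	return 0;
}

-- 
error compiling committee.c: too many arguments to function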