From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [Qemu-devel] Re: [PATCH 26/35] kvm: Eliminate KVMState arguments
Date: Mon, 10 Jan 2011 14:23:57 -0600
Message-ID: <4D2B6ADD.4090505@codemonkey.ws>
References: <4D2616D6.4080309@linux.vnet.ibm.com> <4D26D6CF.5070405@web.de> <4D27A16F.9030809@linux.vnet.ibm.com> <4D282489.90506@web.de> <4D2B6506.6070907@linux.vnet.ibm.com> <4D2B6845.7050809@web.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Anthony Liguori, Marcelo Tosatti, qemu-devel@nongnu.org, kvm@vger.kernel.org, Alexander Graf
To: Jan Kiszka
Return-path:
Received: from mail-iw0-f174.google.com ([209.85.214.174]:56250 "EHLO mail-iw0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752749Ab1AJUYj (ORCPT ); Mon, 10 Jan 2011 15:24:39 -0500
Received: by iwn9 with SMTP id 9so19524074iwn.19 for ; Mon, 10 Jan 2011 12:24:38 -0800 (PST)
In-Reply-To: <4D2B6845.7050809@web.de>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 01/10/2011 02:12 PM, Jan Kiszka wrote:
> Am 10.01.2011 20:59, Anthony Liguori wrote:
>> On 01/08/2011 02:47 AM, Jan Kiszka wrote:
>>> Am 08.01.2011 00:27, Anthony Liguori wrote:
>>>> On 01/07/2011 03:03 AM, Jan Kiszka wrote:
>>>>> Am 06.01.2011 20:24, Anthony Liguori wrote:
>>>>>> On 01/06/2011 11:56 AM, Marcelo Tosatti wrote:
>>>>>>> From: Jan Kiszka
>>>>>>>
>>>>>>> QEMU supports only one VM, so there is only one kvm_state per process,
>>>>>>> and we gain nothing passing a reference to it around. Eliminate any need
>>>>>>> to refer to it outside of kvm-all.c.
>>>>>>>
>>>>>>> Signed-off-by: Jan Kiszka
>>>>>>> CC: Alexander Graf
>>>>>>> Signed-off-by: Marcelo Tosatti
>>>>>>>
>>>>>> I think this is a big mistake.
>>>>>>
>>>>> Obviously, I don't share your concerns. :)
>>>>>
>>>>>> Having to manage kvm_state keeps the abstraction lines well defined.
>>>>>>
>>>>> How does it help?
>>>>>
>>>>>> Otherwise, it's far too easy for portions of code to call into KVM
>>>>>> functions that really shouldn't.
>>>>>>
>>>>> I can't imagine we gain anything from requiring kvm_check_extension
>>>>> callers to hold a kvm_state "capability". Yes, it's now much easier to
>>>>> call kvm_[vm_]ioctl, but that's the key point of this change:
>>>>>
>>>>> So far we primarily complicated the internal interface between generic
>>>>> and arch-dependent kvm parts by requiring kvm_state joggling. But
>>>>> external users already find interfaces without this restriction
>>>>> (kvm_log_*, kvm_ioeventfd_*, ...). That's because it's at least
>>>>> complicated to _cleanly_ pass kvm_state references to all users that
>>>>> need it - e.g. sysbus devices like kvmclock or upcoming in-kernel
>>>>> irqchips.
>>>>>
>>>> I think you're basically making my point for me.
>>>>
>>>> ioeventfd is a broken interface. It shouldn't be a VM ioctl but rather
>>>> a VCPU ioctl because PIO events are dispatched on a per-VCPU basis.
>>>>
>>> OK, but I don't want to argue about the ioeventfd API. So let's put this
>>> case aside. :)
>>>
>>>> kvm_state is available as part of CPU state so it's quite easy to get at
>>>> if these interfaces just took a CPUState argument (and they should).
>>>>
>>> My point is definitely NOT about cpu-bound devices. That case is clear
>>> and is not touched at all by this patch.
>>>
>>> My point is about devices that have clear system scope like kvmclock,
>>> ioapic, pit, pic,
>>>
>> I don't see how ioapic, pit, or pic have a system scope.
>>
> They are not bound to any CPU like the APIC which you may have in mind.
>

And none of the above interact with KVM.  They may be replaced by KVM
but if you look at the PIT, this is done by having two distinct devices.
The KVM specific device can (and should) be instantiated with kvm_state.

The way the IOAPIC/APIC/PIC is handled in qemu-kvm is nasty.  The kernel
devices are separate devices and that should be reflected in the device
tree.

>> I don't know enough about kvmclock.
>>
> It's just the same.
>
>>> whatever-the-future-will-bring. And about KVM services
>>> that have global scope like capability checks and other feature
>>> explorations or VM configurations done by the KVM arch code. You still
>>> didn't explain what we gain in these concrete scenarios by handing the
>>> technically redundant abstraction kvm_state around, especially _inside_
>>> the KVM core.
>>>
>> If you have to pass around a KVMState pointer, you establish an explicit
>> relationship and communication between subsystems. Any place where the
>> global KVMState is used is a red flag that something is wrong.
>>
> It is and will be _only_ used inside kvm-all.c. Again: What is the
> benefit of restricting access to kvm_check_extension this way?
>

The more places that need to deal with KVM compatibility code, the worse
we will be because it's more opportunities to get it wrong.

>> I don't see what the advantage to making all of the KVMState global and
>> implicit. It seems like a big step backwards to me. Can you give a
>> very concrete example of where you think it results in easier to
>> understand code as I don't see how making relationships implicit ever
>> makes code easier to understand?
>>
> The best example does not yet exist (fortunately): Just look at patch 28
> and then try to pass some kvm_state reference to the kvmclock device. Is
> this handle worth changing the sysbus API?
>

Let me look at that patch and reply there.

Regards,

Anthony Liguori

> Jan
>