From mboxrd@z Thu Jan 1 00:00:00 1970
From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH 28/35] kvm: x86: Introduce kvmclock device to save/restore its state
Date: Tue, 18 Jan 2011 10:37:34 -0600
Message-ID: <4D35C1CE.10509@linux.vnet.ibm.com>
References: <4D2B6CB5.9050602@codemonkey.ws> <4D2B74D8.4080309@web.de>
 <4D2B8662.9060909@web.de> <4D2C60FB.7030009@linux.vnet.ibm.com>
 <4D2D80ED.8030405@redhat.com> <4D2D82EE.20002@siemens.com>
 <4D35A39A.8000801@siemens.com> <4D35ABF8.9050700@linux.vnet.ibm.com>
 <4D35B521.3090601@siemens.com> <4D35B6DD.1020005@linux.vnet.ibm.com>
 <4D35B963.7000605@siemens.com> <4D35BA22.7060602@linux.vnet.ibm.com>
 <4D35BD30.1060900@siemens.com>
In-Reply-To: <4D35BD30.1060900@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
To: Jan Kiszka
Cc: Avi Kivity, Markus Armbruster, Marcelo Tosatti, Glauber Costa,
 "kvm@vger.kernel.org", "qemu-devel@nongnu.org"
Sender: kvm-owner@vger.kernel.org

On 01/18/2011 10:17 AM, Jan
Kiszka wrote:
> On 2011-01-18 17:04, Anthony Liguori wrote:
>
>>>>>> A KVM device should sit on a KVM specific bus that hangs off of
>>>>>> sysbus.
>>>>>> It can get to kvm_state through that bus.
>>>>>>
>>>>>> That bus doesn't get instantiated through qdev so requiring a pointer
>>>>>> argument should not be an issue.
>>>>>>
>>>>> This design is in conflict with the requirement to attach KVM-assisted
>>>>> devices also to their home bus, e.g. an assigned PCI device to the PCI
>>>>> bus. We don't support multi-homed qdev devices.
>>>>>
>>>> The bus topology reflects how I/O flows in and out of a device. We do
>>>> not model a perfect PC bus architecture and I don't think we ever intend
>>>> to. Instead, we model a functional architecture.
>>>>
>>>> I/O from an assigned device does not flow through the emulated PCI bus.
>>>> Therefore, it does not belong on the emulated PCI bus.
>>>>
>>>> Assigned devices need to interact with the emulated PCI bus, but they
>>>> shouldn't be children of it.
>>>>
>>> You should be able to find assigned devices on some PCI bus, so you
>>> either have to hack up the existing bus to host devices that are, on the
>>> other side, not part of it or branch off a pci-kvm sub-bus, just like
>>> you would have to create a sysbus-kvm.
>>>
>> Management tools should never traverse the device tree to find
>> devices. This is a recipe for disaster in the long term because the
>> device tree will not remain stable.
>>
>> So yes, a management tool should be able to enumerate assigned devices
>> as they would enumerate any other PCI device but that has almost nothing
>> to do with what the tree layout is.
>>
> I'm probably misunderstanding you, but if the bus topology as the guest
> sees it is not properly reflected in an object tree on the qemu side, we
> are creating hacks again.
>

There is no such thing as the "bus topology as the guest sees it".

The guest just sees a bunch of devices.
The guest can only infer things like ISA busses. The guest sees a bunch of
devices: an i8254, i8259, RTC, etc. Whether those devices are on an ISA bus,
an LPC bus, or all in a SuperI/O chip that's part of the southbridge is all
invisible to the guest.

The device model topology is 100% a hidden architectural detail.

> Management and analysis tools must be able to traverse the system buses
> and find guest devices this way.

We need to provide a compatible interface to the guest. If you agree with
my above statements, then you'll also agree that we can do this without
keeping the device model topology stable.

But we also need to provide a compatible interface to management tools.
Exposing the device model topology as a compatible interface artificially
limits us. It's far better to provide higher level supported interfaces to
give us the flexibility to change the device model as we need to.

> If they create a device on bus X, it
> must never end up on bus Y just because it happens to be KVM-assisted or
> has some other property.

Nope. This is exactly what should happen.

90% of the devices in the device model are not created by management tools.
They're part of a chipset. The chipset has well defined extension points
and we provide management interfaces to create devices on those extension
points. That is, interfaces to create PCI devices.

Regards,

Anthony Liguori

> On the other hand, trying to hide this
> dependency will likely cause severe damage to the qdev design.
>
> Jan
>
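
The distinction argued for in this message, between a stable management interface and an unstable internal tree, can be illustrated with a small sketch. This is not QEMU code and was not part of the thread: `Device`, `Machine`, `query_pci`, `find_pci`, `build_machine`, the bus names and the PCI addresses are all hypothetical names chosen for illustration. The sketch assumes management tools look devices up through a flat index keyed by the guest-visible PCI address, so the answer stays the same whether an assigned device hangs off the emulated PCI bus or off a separate KVM bus.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Device:
    """A node in the internal device tree (hypothetical model, not a QEMU type)."""
    name: str
    pci_addr: Optional[str] = None  # guest-visible PCI address, if any
    children: List["Device"] = field(default_factory=list)


class Machine:
    """Keeps the internal tree and a separate, topology-independent PCI index."""

    def __init__(self) -> None:
        self.root = Device("sysbus")
        self._pci_index: Dict[str, Device] = {}

    def attach(self, parent: Device, dev: Device) -> Device:
        # The tree records where the device lives internally...
        parent.children.append(dev)
        # ...while the index records only what the guest can address.
        if dev.pci_addr is not None:
            self._pci_index[dev.pci_addr] = dev
        return dev

    def query_pci(self) -> List[str]:
        """Management-facing enumeration: never walks the tree."""
        return sorted(self._pci_index)

    def find_pci(self, addr: str) -> Device:
        """Management-facing lookup by guest-visible address."""
        return self._pci_index[addr]


def build_machine(assigned_on_kvm_bus: bool) -> Machine:
    """Build one of two internal layouts with the same guest-visible devices."""
    m = Machine()
    pci = m.attach(m.root, Device("pci.0"))
    kvm = m.attach(m.root, Device("kvm-bus"))
    m.attach(pci, Device("emulated-nic", pci_addr="00:03.0"))
    parent = kvm if assigned_on_kvm_bus else pci
    m.attach(parent, Device("assigned-nic", pci_addr="00:04.0"))
    return m
```

In this model, `build_machine(False)` and `build_machine(True)` produce different trees, yet `query_pci()` and `find_pci()` return the same results for both. A tool that instead walked `root.children` to find the assigned device would break as soon as the layout changed.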