From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33058) by lists.gnu.org with esmtp (Exim 4.71) id 1Umj9E-0006OV-Jf for qemu-devel@nongnu.org; Wed, 12 Jun 2013 07:23:35 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) id 1Umj9A-0001Rl-Qn for qemu-devel@nongnu.org; Wed, 12 Jun 2013 07:23:28 -0400
Received: from smtp.eu.citrix.com ([46.33.159.39]:12622) by eggs.gnu.org with esmtp (Exim 4.71) id 1Umj9A-0001RZ-HI for qemu-devel@nongnu.org; Wed, 12 Jun 2013 07:23:24 -0400
Message-ID: <1371036201.24512.413.camel@zakaz.uk.xensource.com>
From: Ian Campbell
Date: Wed, 12 Jun 2013 12:23:21 +0100
In-Reply-To: <51B8646702000078000DD7EA@nat28.tlf.novell.com>
References: <51B1FF50.90406@eu.citrix.com> <403610A45A2B5242BD291EDAE8B37D3010E56731@SHSMSX102.ccr.corp.intel.com> <51B83E7A02000078000DD6E9@nat28.tlf.novell.com> <1371025885.24512.357.camel@zakaz.uk.xensource.com> <51B8554602000078000DD783@nat28.tlf.novell.com> <1371028940.24512.377.camel@zakaz.uk.xensource.com> <51B8646702000078000DD7EA@nat28.tlf.novell.com>
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Xen-devel] [BUG 1747]Guest could't find bootable device with memory more than 3600M
To: Jan Beulich
Cc: Tim Deegan, Yongjie Ren, yanqiangjun@huawei.com, Keir Fraser, hanweidong@huawei.com, George Dunlap, Xudong Hao, Stefano Stabellini, luonengjun@huawei.com, qemu-devel@nongnu.org, wangzhenguo@huawei.com, xiaowei.yang@huawei.com, arei.gonglei@huawei.com, Paolo Bonzini, YongweiX Xu, SongtaoX Liu, "xen-devel@lists.xensource.com"

On Wed, 2013-06-12 at 11:07 +0100, Jan Beulich wrote:
> >>> On 12.06.13 at 11:22, Ian Campbell wrote:
> > On Wed, 2013-06-12 at 10:02 +0100, Jan Beulich wrote:
> >> >>> On 12.06.13 at 10:31, Ian Campbell wrote:
> >> > On Wed, 2013-06-12 at 08:25 +0100, Jan Beulich wrote:
> >> >> >>> On 11.06.13 at 19:26, Stefano Stabellini wrote:
> >> >> > I went through the code that maps the PCI MMIO regions in hvmloader
> >> >> > (tools/firmware/hvmloader/pci.c:pci_setup) and it looks like it already
> >> >> > maps the PCI region to high memory if the PCI bar is 64-bit and the MMIO
> >> >> > region is larger than 512MB.
> >> >> >
> >> >> > Maybe we could just relax this condition and map the device memory to
> >> >> > high memory no matter the size of the MMIO region if the PCI bar is
> >> >> > 64-bit?
> >> >>
> >> >> I can only recommend not to: For one, guests not using PAE or
> >> >> PSE-36 can't map such space at all (and older OSes may not
> >> >> properly deal with 64-bit BARs at all). And then one would generally
> >> >> expect this allocation to be done top down (to minimize risk of
> >> >> running into RAM), and doing so is going to present further risks of
> >> >> incompatibilities with guest OSes (Linux for example learned only in
> >> >> 2.6.36 that PFNs in ioremap() can exceed 32 bits, but even in
> >> >> 3.10-rc5 ioremap_pte_range(), while using "u64 pfn", passes the
> >> >> PFN to pfn_pte(), the respective parameter of which is
> >> >> "unsigned long").
> >> >>
> >> >> I think this ought to be done in an iterative process - if all MMIO
> >> >> regions together don't fit below 4G, the biggest one should be
> >> >> moved up beyond 4G first, followed by the next to biggest one
> >> >> etc.
> >> >>
> >> >> And, just like many BIOSes have, there ought to be a guest
> >> >> (config) controlled option to shrink the RAM portion below 4G
> >> >> allowing more MMIO blocks to fit.
> >> >>
> >> >> Finally we shouldn't forget the option of not doing any assignment
> >> >> at all in the BIOS, allowing/forcing the OS to use suitable address
> >> >> ranges. Of course any OS is permitted to re-assign resources, but
> >> >> I think they will frequently prefer to avoid re-assignment if already
> >> >> done by the BIOS.
> >> >
> >> > Is "bios=assign-busses" on the guest command line suitable as a
> >> > workaround then? Or possibly "bios=realloc"
> >>
> >> Which command line? Getting passed to hvmloader?
> >
> > I meant the guest kernel command line.
>
> As there's no accessible guest kernel command for HVM guests,
> did you mean to require the guest admin to put something on the
> command line manually?

Yes, as a workaround for this shortcoming of 4.3, not as a long term
solution. It's only people using passthrough with certain devices with
large BARs who will ever trip over this, right?

> And then - this might cover Linux, but what about other OSes,
> namely Windows?

True, I'm not sure if/how this can be done. The only reference I could
find was at
http://windows.microsoft.com/en-id/windows7/using-system-configuration
which says under "Advanced boot options":
        PCI Lock. Prevents Windows from reallocating I/O and IRQ
        resources on the PCI bus. The I/O and memory resources set by
        the BIOS are preserved.

Which seems to suggest the default is to reallocate, but I don't know.

> Oh, and for Linux you confused me by using
> "bios=" instead of "pci="... And "pci=realloc" only exists as of 3.0.

Oops, sorry.

Ian.
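The iterative placement suggested in the quoted text (if all MMIO regions together don't fit below 4G, move the biggest one beyond 4G first, then the next biggest, and so on) can be sketched roughly as follows. This is an illustrative Python model only, not hvmloader code: the names Bar, split_bars and low_hole_size, and the example devices, are made up for the sketch. It assumes only 64-bit BARs may be moved, and it ignores BAR alignment and the actual address assignment.

```python
from dataclasses import dataclass

MIB = 1 << 20


@dataclass
class Bar:
    """Hypothetical description of one PCI memory BAR."""
    name: str
    size: int        # size in bytes
    is_64bit: bool   # assumption: only 64-bit BARs can live above 4G


def split_bars(bars, low_hole_size):
    """Return (low, high): BARs kept below 4G and BARs moved above 4G.

    While the BARs still below 4G do not fit into the MMIO hole, the
    largest remaining 64-bit BAR is moved up, one BAR per iteration.
    """
    # Largest first, so the first 64-bit BAR found is the biggest one.
    low = sorted(bars, key=lambda b: b.size, reverse=True)
    high = []
    while sum(b.size for b in low) > low_hole_size:
        candidate = next((b for b in low if b.is_64bit), None)
        if candidate is None:
            # Only 32-bit BARs are left and they still do not fit.
            raise ValueError("remaining BARs do not fit below 4G")
        low.remove(candidate)
        high.append(candidate)
    return low, high


# Made-up example devices, for illustration only.
example_bars = [
    Bar("vga", 16 * MIB, False),
    Bar("gpu_vram", 1024 * MIB, True),
    Bar("usb", 1 * MIB, False),
    Bar("nic", 128 * MIB, True),
]
low, high = split_bars(example_bars, 512 * MIB)
```

With a 512MB hole only the 1024MB BAR has to move; shrinking the hole forces further 64-bit BARs up in size order, and small or 32-bit-only BARs stay below 4G for as long as possible, which is the compatibility argument made above.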