From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:39299)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Umhw5-0002z0-AN
	for qemu-devel@nongnu.org; Wed, 12 Jun 2013 06:05:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Umhvx-0007K9-0T
	for qemu-devel@nongnu.org; Wed, 12 Jun 2013 06:05:49 -0400
Received: from smtp.citrix.com ([66.165.176.89]:57721)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Umhvw-0007Jn-S0
	for qemu-devel@nongnu.org; Wed, 12 Jun 2013 06:05:40 -0400
Message-ID: <51B847E3.5010604@eu.citrix.com>
Date: Wed, 12 Jun 2013 11:05:23 +0100
From: George Dunlap
MIME-Version: 1.0
References: <51B1FF50.90406@eu.citrix.com>
	<403610A45A2B5242BD291EDAE8B37D3010E56731@SHSMSX102.ccr.corp.intel.com>
	<51B83E7A02000078000DD6E9@nat28.tlf.novell.com>
In-Reply-To: <51B83E7A02000078000DD6E9@nat28.tlf.novell.com>
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [Xen-devel] [BUG 1747]Guest could't find bootable
	device with memory more than 3600M
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Jan Beulich
Cc: Tim Deegan , Yongjie Ren , yanqiangjun@huawei.com, Keir Fraser ,
	Ian Campbell , hanweidong@huawei.com, Xudong Hao ,
	Stefano Stabellini , luonengjun@huawei.com, qemu-devel@nongnu.org,
	wangzhenguo@huawei.com, xiaowei.yang@huawei.com,
	arei.gonglei@huawei.com, Paolo Bonzini , YongweiX Xu ,
	SongtaoX Liu , "xen-devel@lists.xensource.com"

On 12/06/13 08:25, Jan Beulich wrote:
>>>> On 11.06.13 at 19:26, Stefano Stabellini wrote:
>> I went through the code that maps the PCI MMIO regions in hvmloader
>> (tools/firmware/hvmloader/pci.c:pci_setup) and it looks like it already
>> maps the PCI region to high memory if the PCI bar is 64-bit and the MMIO
>> region is larger than 512MB.
>>
>> Maybe we could just relax this condition and map the device memory to
>> high memory no matter the size of the MMIO region if the PCI bar is
>> 64-bit?
> I can only recommend not to: For one, guests not using PAE or
> PSE-36 can't map such space at all (and older OSes may not
> properly deal with 64-bit BARs at all). And then one would generally
> expect this allocation to be done top down (to minimize risk of
> running into RAM), and doing so is going to present further risks of
> incompatibilities with guest OSes (Linux for example learned only in
> 2.6.36 that PFNs in ioremap() can exceed 32 bits, but even in
> 3.10-rc5 ioremap_pte_range(), while using "u64 pfn", passes the
> PFN to pfn_pte(), the respective parameter of which is
> "unsigned long").
>
> I think this ought to be done in an iterative process - if all MMIO
> regions together don't fit below 4G, the biggest one should be
> moved up beyond 4G first, followed by the next to biggest one
> etc.

First of all, the proposal to move the PCI BAR up to the 64-bit range is
a temporary work-around.  It should only be done if a device doesn't fit
in the current MMIO range.

We have four options here:
1. Don't do anything
2. Have hvmloader move PCI devices up to the 64-bit MMIO hole if they
   don't fit
3. Convince qemu to allow MMIO regions to mask memory (or what it thinks
   is memory).
4. Add a mechanism to tell qemu that memory is being relocated.

Number 4 is definitely the right answer long-term, but we just don't
have time to do that before the 4.3 release.  We're not sure yet if #3
is possible; even if it is, it may have unpredictable knock-on effects.

Doing #2, it is true that many guests will be unable to access the
device because of 32-bit limitations.  However, in #1, *no* guests will
be able to access the device.  At least in #2, *many* guests will be
able to do so.  In any case, apparently #2 is what KVM does, so having
the limitation on guests is not without precedent.
It's also likely to be a somewhat tested configuration (unlike #3, for
example).

 -George
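For readers following along: the iterative relocation Jan describes (move the biggest MMIO region above 4G first, then the next biggest, until the rest fits below 4G) can be sketched roughly as below. This is an illustrative sketch only; the struct, field names, and hole size are hypothetical and are not hvmloader's actual code (which lives in tools/firmware/hvmloader/pci.c), and it ignores the natural-alignment constraints real BAR placement must honor.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical BAR descriptor, for illustration only. */
struct bar {
    uint64_t size;   /* BAR size in bytes (a power of two) */
    bool is_64bit;   /* may the BAR be placed above 4G? */
    bool high;       /* out: relocated above 4G */
};

/* Sort largest-first so each relocation frees the most low space. */
static int cmp_size_desc(const void *a, const void *b)
{
    const struct bar *x = a, *y = b;
    return (x->size < y->size) - (x->size > y->size);
}

/*
 * Greedy sketch of the iterative process: while the BARs do not all
 * fit in the 32-bit MMIO hole, move the largest 64-bit-capable BAR
 * above 4G.  Returns the number of BARs moved high, or -1 if the
 * remaining 32-bit BARs still do not fit.
 */
int relocate_bars(struct bar *bars, int n, uint64_t hole_size)
{
    uint64_t low_total = 0;
    int moved = 0;

    for (int i = 0; i < n; i++) {
        bars[i].high = false;
        low_total += bars[i].size;
    }

    qsort(bars, n, sizeof(bars[0]), cmp_size_desc);

    for (int i = 0; i < n && low_total > hole_size; i++) {
        if (!bars[i].is_64bit)
            continue;            /* 32-bit BARs must stay below 4G */
        bars[i].high = true;
        low_total -= bars[i].size;
        moved++;
    }

    return low_total <= hole_size ? moved : -1;
}
```

Note the deliberate consequence discussed above: a 32-bit-only guest loses sight of whatever is moved high, but everything left below 4G remains reachable, which is the #2 trade-off versus #1's total loss of the device.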