From: "Jan Beulich" <JBeulich@suse.com>
To: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Tim Deegan <tim@xen.org>, Yongjie Ren <yongjie.ren@intel.com>,
	yanqiangjun@huawei.com, Keir Fraser <keir@xen.org>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	hanweidong@huawei.com, Xudong Hao <xudong.hao@intel.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
	luonengjun@huawei.com, qemu-devel@nongnu.org,
	wangzhenguo@huawei.com, xiaowei.yang@huawei.com,
	arei.gonglei@huawei.com, Paolo Bonzini <pbonzini@redhat.com>,
	YongweiX Xu <yongweix.xu@intel.com>,
	SongtaoX Liu <songtaox.liu@intel.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>
Subject: Re: [Qemu-devel] [Xen-devel] [BUG 1747]Guest could't find bootable device with memory more than 3600M
Date: Wed, 12 Jun 2013 11:11:31 +0100
Message-ID: <51B8657302000078000DD7FD@nat28.tlf.novell.com>
In-Reply-To: <51B847E3.5010604@eu.citrix.com>

>>> On 12.06.13 at 12:05, George Dunlap <george.dunlap@eu.citrix.com> wrote:
> On 12/06/13 08:25, Jan Beulich wrote:
>>>>> On 11.06.13 at 19:26, Stefano Stabellini <stefano.stabellini@eu.citrix.com> wrote:
>>> I went through the code that maps the PCI MMIO regions in hvmloader
>>> (tools/firmware/hvmloader/pci.c:pci_setup) and it looks like it already
>>> maps the PCI region to high memory if the PCI bar is 64-bit and the MMIO
>>> region is larger than 512MB.
>>>
>>> Maybe we could just relax this condition and map the device memory to
>>> high memory no matter the size of the MMIO region if the PCI bar is
>>> 64-bit?
>> I can only recommend not to: For one, guests not using PAE or
>> PSE-36 can't map such space at all (and older OSes may not
>> properly deal with 64-bit BARs at all). And then one would generally
>> expect this allocation to be done top down (to minimize risk of
>> running into RAM), and doing so is going to present further risks of
>> incompatibilities with guest OSes (Linux for example learned only in
>> 2.6.36 that PFNs in ioremap() can exceed 32 bits, but even in
>> 3.10-rc5 ioremap_pte_range(), while using "u64 pfn", passes the
>> PFN to pfn_pte(), the respective parameter of which is
>> "unsigned long").
>>
>> I think this ought to be done in an iterative process - if all MMIO
>> regions together don't fit below 4G, the biggest one should be
>> moved up beyond 4G first, followed by the next to biggest one
>> etc.
> 
> First of all, the proposal to move the PCI BAR up to the 64-bit range is 
> a temporary work-around.  It should only be done if a device doesn't fit 
> in the current MMIO range.
> 
> We have four options here:
> 1. Don't do anything
> 2. Have hvmloader move PCI devices up to the 64-bit MMIO hole if they 
> don't fit
> 3. Convince qemu to allow MMIO regions to mask memory (or what it thinks 
> is memory).
> 4. Add a mechanism to tell qemu that memory is being relocated.
> 
> Number 4 is definitely the right answer long-term, but we just don't 
> have time to do that before the 4.3 release.  We're not sure yet if #3 
> is possible; even if it is, it may have unpredictable knock-on effects.
> 
> Doing #2, it is true that many guests will be unable to access the 
> device because of 32-bit limitations.  However, in #1, *no* guests will 
> be able to access the device.  At least in #2, *many* guests will be 
> able to do so.  In any case, apparently #2 is what KVM does, so having 
> the limitation on guests is not without precedent.  It's also likely to 
> be a somewhat tested configuration (unlike #3, for example).

That's all fine with me. My objection was to Stefano's suggestion
to assign high addresses to _all_ 64-bit capable BARs, not just
the biggest one(s).
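
The iterative approach quoted above (move the biggest region beyond 4G
first, then the next biggest, and stop as soon as the rest fits) can be
sketched as follows. This is only an illustration of the policy, not
hvmloader's code: the Bar class, relocate_until_fit(), the device names
and the sizes are all hypothetical, and alignment padding is ignored.

```python
from dataclasses import dataclass

MIB = 1 << 20


@dataclass
class Bar:
    name: str           # hypothetical label, for illustration only
    size: int           # BAR size in bytes
    is_64bit: bool      # True if the BAR can be programmed above 4 GiB
    high: bool = False  # set once the BAR has been moved above 4 GiB


def relocate_until_fit(bars, low_hole_size):
    """Move the largest 64-bit capable BARs above 4 GiB, one at a time,
    until the remaining BARs fit into the MMIO hole below 4 GiB.

    Returns True if the BARs left below 4 GiB fit, False otherwise.
    Alignment padding is ignored to keep the sketch short.
    """
    def low_total():
        return sum(b.size for b in bars if not b.high)

    # Only 64-bit capable BARs may move; try the biggest ones first.
    candidates = sorted((b for b in bars if b.is_64bit),
                        key=lambda b: b.size, reverse=True)
    for bar in candidates:
        if low_total() <= low_hole_size:
            break  # everything left fits, so stop relocating
        bar.high = True
    return low_total() <= low_hole_size


# Made-up example: a 256 MiB hole and three BARs.
bars = [
    Bar("gfx", 512 * MIB, is_64bit=True),
    Bar("nic", 16 * MIB, is_64bit=True),
    Bar("vga", 32 * MIB, is_64bit=False),
]
ok = relocate_until_fit(bars, 256 * MIB)
```

In this made-up case only "gfx" ends up above 4G. "nic" is 64-bit capable
too, but stays low, so a guest that cannot reach addresses above 4G can
still use it.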

Jan
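
For reference, the quoted pfn_pte() concern comes down to simple
arithmetic. The sketch below assumes 4 KiB pages and a 32-bit
"unsigned long", as on a 32-bit x86 kernel. The helper names are
hypothetical and it models only the integer conversion, not the kernel
code itself.

```python
PAGE_SHIFT = 12     # assumption: 4 KiB pages
ULONG_BITS_32 = 32  # assumption: 32-bit "unsigned long"


def pfn_as_unsigned_long(phys_addr, ulong_bits=ULONG_BITS_32):
    """Return (pfn, pfn after conversion to an unsigned long of that width)."""
    pfn = phys_addr >> PAGE_SHIFT
    truncated = pfn & ((1 << ulong_bits) - 1)
    return pfn, truncated


def survives(phys_addr, ulong_bits=ULONG_BITS_32):
    """True if the PFN of phys_addr is unchanged by the conversion."""
    pfn, truncated = pfn_as_unsigned_long(phys_addr, ulong_bits)
    return pfn == truncated


just_above_4g = survives(1 << 32)        # PFN is 2**20, fits in 32 bits
below_16t = survives((1 << 44) - 4096)   # last page below 16 TiB
at_16t = survives(1 << 44)               # PFN is 2**32, does not fit
at_16t_64bit = survives(1 << 44, 64)     # 64-bit unsigned long keeps it
```

Under these assumptions a BAR placed just above 4G is unaffected. The
PFN loses bits only for addresses at or above 16 TiB, which is where a
top-down allocation in a large physical address space could end up.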
