* QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
@ 2015-03-03 14:29 Gordan Bobic
  2015-03-04 10:11 ` Stefano Stabellini
  0 siblings, 1 reply; 8+ messages in thread
From: Gordan Bobic @ 2015-03-03 14:29 UTC (permalink / raw)
  To: xen-devel

Hi,

I've been looking into custom e820 maps for domUs again, and
found that functionality to provide QEMU with hints regarding
e820 mapping has been upstream since some time in
2010 (FW_CFG_E820_TABLE) with more finely grained control
(usable rather than just reserved entries) upstream since
2013 (fw_cfg etc/e820).

The respective patches are here:
http://lists.gnu.org/archive/html/qemu-devel/2010-02/msg00996.html
http://lists.gnu.org/archive/html/qemu-devel/2013-11/msg00593.html

What I have not been able to find is any documentation at
all on how this e820 data can be given to QEMU when starting
a domain. I can see from the structs in the patches how the
data is packed for the relevant code to consume, but I cannot
figure out what the delivery vector for this data is. How
can I get QEMU to ingest the hints about any additional
reserved e820 blocks?

For context, I need this to work around IOMMU implementation
bugs and mark areas of address space as reserved so that
the guest doesn't trample over the host's PCI I/O ranges
(which IOMMU should intercept, but being buggy, it doesn't).

Many thanks in advance.

Gordan
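
For illustration, the packing mentioned above can be sketched as
follows. This is a minimal sketch, not QEMU's code: it assumes each
entry is a packed 64-bit address, 64-bit length and 32-bit type stored
little-endian (20 bytes per entry), and that the file is a bare array
of such entries with no count header. Both assumptions should be
checked against the linked patches. The names e820_range, put_le and
e820_pack are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* E820 address range types (1 = usable RAM, 2 = reserved). */
#define E820_RAM      1u
#define E820_RESERVED 2u

/* Assumed wire size of one entry: u64 address, u64 length, u32 type. */
#define E820_ENTRY_SIZE 20u

/* Hypothetical in-memory form of one range. */
struct e820_range {
    uint64_t address;
    uint64_t length;
    uint32_t type;
};

/* Store the low `bytes` bytes of v at p, least significant byte first. */
static void put_le(uint8_t *p, uint64_t v, size_t bytes)
{
    assert(bytes <= 8);
    for (size_t i = 0; i < bytes; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Serialise n ranges into out, 20 bytes each.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t e820_pack(const struct e820_range *r, size_t n,
                 uint8_t *out, size_t out_len)
{
    if (n > out_len / E820_ENTRY_SIZE)
        return 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t *p = out + i * E820_ENTRY_SIZE;
        put_le(p, r[i].address, 8);
        put_le(p + 8, r[i].length, 8);
        put_le(p + 16, r[i].type, 4);
    }
    return n * E820_ENTRY_SIZE;
}
```

Packing a single reserved range at 0xE0000000 of length 0x10000000
yields 20 bytes. How such a buffer would reach the guest is exactly the
open question in this message, so the sketch stops at serialisation.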


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-03 14:29 QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820) Gordan Bobic
@ 2015-03-04 10:11 ` Stefano Stabellini
  2015-03-04 10:25   ` Gordan Bobic
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Stabellini @ 2015-03-04 10:11 UTC (permalink / raw)
  To: Gordan Bobic; +Cc: xen-devel

Hello Gordan,

FW_CFG_E820_TABLE is a special interface between SeaBIOS and QEMU but is
not used on Xen. I guess it could be made to work on Xen, but I am
pretty sure it doesn't at the moment.

I think you would probably want to look at hvmloader instead:
tools/firmware/hvmloader/e820.c.

Cheers,

Stefano


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:11 ` Stefano Stabellini
@ 2015-03-04 10:25   ` Gordan Bobic
  2015-03-04 10:33     ` Stefano Stabellini
  0 siblings, 1 reply; 8+ messages in thread
From: Gordan Bobic @ 2015-03-04 10:25 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

Stefano,

Many thanks for responding to this. Replies inline below.

On 2015-03-04 10:11, Stefano Stabellini wrote:
> FW_CFG_E820_TABLE is a special interface between SeaBIOS and QEMU but is
> not used on Xen. I guess it could be made to work on Xen, but I am
> pretty sure it doesn't at the moment.

So this cannot be used to side-load an additional list of
e820 reserved memory blocks at domU startup time?

> I think you would probably want to look at hvmloader instead:
> tools/firmware/hvmloader/e820.c.

Yes, this is what I was looking at last time. I was just
hoping that either of the above-mentioned patches could
be used to adjust the e820 map in a "soft" way rather than
hard-coding any changes into hvmloader/e820.c. The latter
is what I did last time, but it is extremely ugly and
non-generic.

And given the two interfaces I mentioned above it seems
really wrong to be implementing a whole new method for
manually loading an explicit e820 map. Is that not what
the etc/e820 interface is already supposed to do?

Gordan


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:25   ` Gordan Bobic
@ 2015-03-04 10:33     ` Stefano Stabellini
  2015-03-04 10:38       ` Gordan Bobic
  0 siblings, 1 reply; 8+ messages in thread
From: Stefano Stabellini @ 2015-03-04 10:33 UTC (permalink / raw)
  To: Gordan Bobic; +Cc: xen-devel, Stefano Stabellini

On Wed, 4 Mar 2015, Gordan Bobic wrote:
> > FW_CFG_E820_TABLE is a special interface between SeaBIOS and QEMU but is
> > not used on Xen. I guess it could be made to work on Xen, but I am
> > pretty sure it doesn't at the moment.
> 
> So this cannot be used to side-load an additional list of
> e820 reserved memory blocks at domU startup time?

Nope.
Enabling the usage of FW_CFG_E820_TABLE on Xen is conceivable and once
done you would be able to use it, but today it would not work.


> > I think you would probably want to look at hvmloader instead:
> > tools/firmware/hvmloader/e820.c.
> 
> Yes, this is what I was looking at last time. I was just
> hoping that either of the above-mentioned patches could
> be used to adjust the e820 map in a "soft" way rather than
> hard-coding any changes into hvmloader/e820.c. The latter
> is what I did last time, but it is extremely ugly and
> non-generic.

I see.


> And given the two interfaces I mentioned above it seems
> really wrong to be implementing a whole new method for
> manually loading an explicit e820 map. Is that not what
> the etc/e820 interface is already supposed to do?

I don't follow you here: what is the etc/e820 interface?


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:33     ` Stefano Stabellini
@ 2015-03-04 10:38       ` Gordan Bobic
  2015-03-04 10:50         ` Ian Campbell
  2015-03-04 10:50         ` Stefano Stabellini
  0 siblings, 2 replies; 8+ messages in thread
From: Gordan Bobic @ 2015-03-04 10:38 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel

On 2015-03-04 10:33, Stefano Stabellini wrote:
> I don't follow you here: what is the etc/e820 interface?

See the 2nd patch I mentioned above, which supposedly
adds "etc/e820 fw_cfg file" (whatever that means).

Gordan


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:38       ` Gordan Bobic
@ 2015-03-04 10:50         ` Ian Campbell
  2015-03-04 10:50         ` Stefano Stabellini
  1 sibling, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2015-03-04 10:50 UTC (permalink / raw)
  To: Gordan Bobic; +Cc: xen-devel, Stefano Stabellini

On Wed, 2015-03-04 at 10:38 +0000, Gordan Bobic wrote:
> > I don't follow you here: what is the etc/e820 interface?
> 
> See the 2nd patch I mentioned above, which supposedly
> adds "etc/e820 fw_cfg file" (whatever that means).

The problem here is that hvmloader controls and can make modifications
to the memory map, and therefore needs to be the entity which provides
the e820 to SeaBIOS. etc/e820 comes from qemu which does not have full
information.

I suppose we could enable etc/e820 for those who really know what they
are doing, but it seems like it would be open to abuse (and resulting
bug reports to us).

Xen already has XENMEM_set_memory_map and XENMEM_memory_map which the
tools can use to pass a memory map to the guest (e.g. where hvmloader
could retrieve it as a baseline).

I think today this is only really used with PV guests using the
e820_host=1 option. Perhaps that could be extended to HVM guests? Or
perhaps a separate (xenstore-based?) mechanism for the toolstack to
blacklist memory regions in the guest, so hvmloader can take those into
account?

Ian.
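
To make the blacklist idea concrete, here is a minimal sketch of how a
baseline map could have one blacklisted window carved out of its RAM
entries. It is not hvmloader's code and does not touch xenstore or the
XENMEM_* hypercalls. The names e820_range, emit and e820_blacklist are
hypothetical, and it assumes the input entries do not overlap and that
address + length does not wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* E820 address range types (1 = usable RAM, 2 = reserved). */
#define E820_RAM      1u
#define E820_RESERVED 2u

/* Hypothetical in-memory form of one range. */
struct e820_range {
    uint64_t address;
    uint64_t length;
    uint32_t type;
};

/* Append one range to out; returns 0 if out is already full. */
static int emit(struct e820_range *out, size_t cap, size_t *n,
                uint64_t address, uint64_t length, uint32_t type)
{
    if (*n >= cap)
        return 0;
    out[*n].address = address;
    out[*n].length = length;
    out[*n].type = type;
    (*n)++;
    return 1;
}

/*
 * Copy `in` to `out`, turning any RAM that overlaps [start, end) into
 * a reserved entry and keeping the RAM on either side of it.
 * Non-RAM entries and RAM outside the window are copied unchanged.
 * Returns the number of entries written, or 0 if `out` is too small.
 */
size_t e820_blacklist(const struct e820_range *in, size_t n_in,
                      uint64_t start, uint64_t end,
                      struct e820_range *out, size_t cap)
{
    size_t n = 0;

    assert(start < end);
    for (size_t i = 0; i < n_in; i++) {
        uint64_t s = in[i].address;
        uint64_t e = s + in[i].length;

        if (in[i].type != E820_RAM || e <= start || s >= end) {
            if (!emit(out, cap, &n, s, in[i].length, in[i].type))
                return 0;
            continue;
        }

        uint64_t lo = s > start ? s : start;
        uint64_t hi = e < end ? e : end;

        /* RAM below the window, the reserved overlap, RAM above it. */
        if (s < start && !emit(out, cap, &n, s, start - s, E820_RAM))
            return 0;
        if (!emit(out, cap, &n, lo, hi - lo, E820_RESERVED))
            return 0;
        if (e > end && !emit(out, cap, &n, end, e - end, E820_RAM))
            return 0;
    }
    return n;
}
```

Only RAM that overlaps the window is converted. A blacklisted window
that falls in a hole of the baseline map produces no new entry, which
may or may not be what a real implementation would want.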


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:38       ` Gordan Bobic
  2015-03-04 10:50         ` Ian Campbell
@ 2015-03-04 10:50         ` Stefano Stabellini
  2015-03-04 11:04           ` Ian Campbell
  1 sibling, 1 reply; 8+ messages in thread
From: Stefano Stabellini @ 2015-03-04 10:50 UTC (permalink / raw)
  To: Gordan Bobic; +Cc: xen-devel, Stefano Stabellini

On Wed, 4 Mar 2015, Gordan Bobic wrote:
> See the 2nd patch I mentioned above, which supposedly
> adds "etc/e820 fw_cfg file" (whatever that means).

Ah I see. Yes, this looks pretty much like what you need. We could
expose something similar from xl/libxl and make use of this QEMU
functionality on Xen. We would also need to coordinate with hvmloader
but it should be doable.


* Re: QEMU e820 Reservation (FW_CFG_E820_TABLE and fw_cfg etc/e820)
  2015-03-04 10:50         ` Stefano Stabellini
@ 2015-03-04 11:04           ` Ian Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Ian Campbell @ 2015-03-04 11:04 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Gordan Bobic, xen-devel

On Wed, 2015-03-04 at 10:50 +0000, Stefano Stabellini wrote:
> Ah I see. Yes, this looks pretty much like what you need. We could
> expose something similar from xl/libxl and make use of this QEMU
> functionality on Xen. We would also need to coordinate with hvmloader
> but it should be doable.

I think going via qemu for this will turn out to be rather complex,
without a wholesale reworking of how (and by whom) the guest address map
is laid out.

Ian.
