From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: "Matt Fleming" <matt@codeblueprint.co.uk>,
"Michael Chang" <MChang@suse.com>,
linux-kernel@vger.kernel.org,
"Julien Grall" <julien.grall@arm.com>,
"Jan Beulich" <JBeulich@suse.com>,
"H. Peter Anvin" <hpa@zytor.com>,
"Daniel Kiper" <daniel.kiper@oracle.com>,
x86@kernel.org, "Vojtěch Pavlík" <vojtech@suse.cz>,
"Gary Lin" <GLin@suse.com>,
xen-devel@lists.xenproject.org,
"Jeffrey Cheung" <JCheung@suse.com>,
"Charles Arnold" <carnold@suse.com>,
"Stefano Stabellini" <stefano.stabellini@eu.citrix.com>,
joeyli <jlee@suse.com>, "Borislav Petkov" <bp@alien8.de>,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
"Juergen Gross" <jgross@suse.com>,
"Andrew Cooper" <andrew.cooper3@citrix.com>,
"Jim Fehlig" <jfehlig@suse.com>,
"Andy Lutomirski" <luto@amacapital.net>,
"David Vrabel" <david.vrabel@citrix.com>,
"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: HVMLite / PVHv2 - using x86 EFI boot entry
Date: Wed, 13 Apr 2016 14:56:29 -0400
Message-ID: <20160413185629.GA7501@char.us.oracle.com>
In-Reply-To: <20160413182951.GW1990@wotan.suse.de>
On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R. Rodriguez wrote:
> On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> > On 08/04/16 22:40, Luis R. Rodriguez wrote:
> > > On Wed, Apr 06, 2016 at 10:40:08AM +0100, David Vrabel wrote:
> > >> On 06/04/16 03:40, Luis R. Rodriguez wrote:
> > >>>
> > >>> * You don't need full EFI emulation
> > >>
> > >> I think needing any EFI emulation inside Xen (which is where it would
> > >> need to be for dom0) is not suitable because of the increase in
> > >> hypervisor ABI.
> > >
> > > Is this because of timing on architecture / design of HVMLite, or
> > > a general position that the complexity to deal with EFI emulation
> > > is too much for Xen's taste ?
> >
> > The Xen hypervisor should be as small as possible. Adding an EFI
> > emulator will be adding quite some code. This should be done after a
> > very thorough evaluation only.
>
> Sure.
>
> > > ARM already went the EFI entry way for domU -- it went the OVMF route,
> > > would such a possibility be possible for x86 domU HVMLite ? If not why
> > > not, I mean it would seem to make sense to at least mimic the same type
> > > of early boot environment, and perhaps there are some lessons to be
> > > learned from that effort too.
> >
> > The final solution must be appropriate for dom0, too. So don't try
> > to limit the discussion to domU. If dom0 isn't going to be acceptable
> > there will be no need to discuss domU.
>
> Understood. George noted that on ARM dom0 still uses the ARM native entry
> point, it seems to accomplish this as it uses a device tree node. I'll
> chime in on that in another thread.
>
> > > Are there some lessons to be learned with ARM's effort? What are they?
> > > If that could be re-done again with any type of cleaner path, what
> > > could that be that could help the x86 side ?
> > >
> > > Although emulating EFI may require work, some folks have pointed out
> > > that the amount of work may not be that much. If that is done can
> > > we instead rely on the same code to replace OVMF to support both
> > > Xen ARM and Xen HVMLite on x86 ? What would be the pros / cons of
> > > this ?
> > >
> > >> I also still do not understand your objection to the current tiny stub.
> > >
> > > It's more of a hypothetical -- can an EFI entry be used instead given
> > > it already does exactly what the new small entry does ? It's also rather
> > > odd to add a new entry without evaluating fully a possible alternative
> > > that would provide the same exact mechanism.
> >
> > The interface isn't the new entry only. It should be evaluated how much
> > of the early EFI boot path would be common to the HVMlite one.
>
> We also have other asm code which can be shared. I'll reply to Boris'
> original e-mail with what I can identify as perhaps sharable. There is
> obviously more as you allude.
>
> > What would be gained by using the same entry but having two different boot
> > paths after it?
>
> It's a good question. In summary for me it would be the push for sharing more
> code and the push for semantics on early boot to address differences
> proactively, and ultimately it may enable us to help bring the old PV
> boot path closer.
But why? We want to kill PV (eventually).
>
> I'll elaborate on this but first let's clarify why a new entry is used for
> HVMlite to start off with:
>
> 1) Xen ABI has historically not wanted to set up the boot params for Linux
> guests, instead it insists on letting the Linux kernel Xen boot stubs fill
> that out for it. This sticking point means it has implicated a boot stub.
Which is because it has to be OS agnostic. It has nothing to do with 'not wanting'.
> The HVMLite boot entry tries to bring the boot entries paths closer as it
> leverages more of the HVM boot path philosophy to mimic the regular PC boot
> path.
>
> Is HVMLite supposed to support legacy PV guests as well BTW ?
Gosh no.
>
> Reason I'm highlighting Xen ABI as a *reason* alone is that even with
> today's large discrepancy on the old PV boot path I believe we can
> bring together the boot paths closer together if the Xen ABI was slightly
> flexible about this, I've highlighted how I believe that is possible before,
<runs away screaming>
> *iff* the Xen ABI would at the very least set 2 things only:
>
> a) Hypervisor type
> b) A custom data pointer
>
> This would enable a single boot entry on the guest to handle then:
>
> Pseudo code:
>
>       startup_32()                     startup_64()
>            |                                |
>            |                                |
>            V                                V
> pre_hypervisor_stub_32()         pre_hypervisor_stub_64()
>            |                                |
>            |                                |
>            V                                V
> [existing startup_32()]          [existing startup_64()]
>            |                                |
>            |                                |
>            V                                V
> post_hypervisor_stub_32()        post_hypervisor_stub_64()
>
>
> If the Xen ABI was flexible about setting a hypervisor type and custom
> data pointer then we would have handlers for it, and in it, it can
> do whatever it thinks is needed for its own guest types. It could
> also continue to set the zero page on its own as it sees fit.
>
> Again, note that if this is done it could also mean even bringing together
> the old PV boot path closer together... so this is not just a prospect
> for HVMLite but also for old PV guests.
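To make the proposal quoted above concrete, here is a minimal sketch in C of a single entry that dispatches on a hypervisor type and hands the custom data pointer to per-hypervisor stubs. Every name in it (`boot_hv_info`, `hv_stub_ops`, `boot_entry`, the `HV_*` values, `demo_state`) is hypothetical and made up for illustration; nothing here is an existing Xen or Linux interface, and real early-boot code would be assembly rather than C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hypervisor type values; not taken from any real ABI. */
enum { HV_NONE = 0, HV_XEN_PV, HV_XEN_HVMLITE, HV_TYPE_MAX };

/* The two things the proposal asks the hypervisor to set. */
struct boot_hv_info {
    unsigned int type;  /* a) hypervisor type */
    void *data;         /* b) custom data pointer, opaque to generic code */
};

/* Per-hypervisor handlers run before and after the generic startup path. */
struct hv_stub_ops {
    void (*pre)(void *data);
    void (*post)(void *data);
};

/* Stand-in for whatever the hypervisor's data pointer would refer to. */
struct demo_state { int pre_ran; int post_ran; };

static void hvmlite_pre(void *data)  { ((struct demo_state *)data)->pre_ran++; }
static void hvmlite_post(void *data) { ((struct demo_state *)data)->post_ran++; }

/* Types without handlers (e.g. bare metal) keep NULL entries. */
static const struct hv_stub_ops hv_ops[HV_TYPE_MAX] = {
    [HV_XEN_HVMLITE] = { hvmlite_pre, hvmlite_post },
};

/* Counts runs of the shared path; stands in for the existing startup code. */
int generic_startup_calls;

static void generic_startup(void) { generic_startup_calls++; }

/* Single entry: pre stub, shared startup code, post stub. */
int boot_entry(const struct boot_hv_info *info)
{
    assert(info != NULL);
    if (info->type >= HV_TYPE_MAX)
        return -1;      /* unknown hypervisor type */

    const struct hv_stub_ops *ops = &hv_ops[info->type];
    if (ops->pre)
        ops->pre(info->data);
    generic_startup();
    if (ops->post)
        ops->post(info->data);
    return 0;
}
```

The point of the sketch is only that the shared path stays the same for every caller, while the differences are named explicitly in one table instead of being implied by which entry point was used.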
>
> 2) Because of 1) it has meant that no formal semantics for early boot
> code are available, and so severe differences can best be addressed also
> by yet another boot entry. This has meant often times not addressing
There are semantics written for this new code: http://xenbits.xen.org/docs/unstable/misc/hvmlite.html
All the other ones, related to low-level operations, are described in the Intel SDM.
> or not knowing if we've addressed real differences between the different
> entries. Case in point, dead code [0]. How do we know we will not run
> certain code that should not run for the different entries ? Without
> *any* semantics later in boot code to distinguish where we came from
> and because we strive to build single kernels with different possible
> run time environments it means we have tons of code available to
> execute / run that we may not need.
I am not following that. PVH, aka HVMLite, will pretty much erase the need for
pvops.
>
> Because of the lack of semantics we may still have dead code prospects
> with the new HVMLite entry. How are we sure there are no differences ?
>
> [0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
>
> 3) Unikernel / other OS requirements: this is really tied to 2) but even if
> we tried to evolve the Xen ABI it would mean considering existing solutions
> out there. Things to consider as an example: FreeBSD doesn't have an EFI
> entry, unikernels want a simple boot entry.
>
> With this in mind then, that I can think of:
>
> Cons of using the same entry but having two different boot paths:
>
> * Pushes the Xen ABI, needs to make everyone happy, this is hard
> * Perhaps harder to implement
>
> Gains of striving to use the same entry but having two different boot:
>
> * Helps to share more code easily
> * Reduce attack surface
> * Requires us to have semantics for early boot; this has a series of
>   side benefits:
>   - Means you should try to address differences explicitly rather than
>     implicitly -- case in point Dead Code
>
> > You still need a way to distinguish between bare metal
> > EFI and HVMlite.
>
> Great point! This is the semantics aspect. The new entry for HVMlite approach
> deals with this by making the differences implicit by the new entry point.
> My call for addressing this through a hypervisor type was to see if we can
> get those semantics added explicitly so we can also later address dead
> code concerns for the new HVMLite guest type.
Right, they are.
>
> Part of my own interest in an EFI entry here is that EFI could be used to help
> expand on the semantics in an OS/agnostic form rather than pushing the x86 boot
> protocol further. That seems to have its own set of drawbacks though.
>
>
> > And Xen needs a way to find out whether a kernel is
> > supporting HVMlite to boot it in the correct mode.
>
> How was Xen going to find out if new kernels had HVMlite support with the
> new entry ? An ELFNOTE() ? If an entry is shared could we not use an
Yeah.
> ELFNOTE() also for this though too ?
Not sure what you mean by 'shared'. But you can add multiple ELF PT_NOTEs.
See the ELF specification.
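As a rough illustration of the multiple-notes point, here is a minimal sketch in C of walking a note segment and picking out one note by name and type. The layout (namesz, descsz, type, then name and descriptor, each padded to 4 bytes) is the generic ELF note format; the helper names `find_note` and `align4` are made up for this example, and it assumes the buffer uses the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF note header; the name and descriptor follow, each padded to 4 bytes. */
struct note_hdr {
    uint32_t namesz;  /* length of name, including the trailing NUL */
    uint32_t descsz;  /* length of the descriptor payload */
    uint32_t type;    /* meaning is defined by the owner named in 'name' */
};

static size_t align4(size_t x) { return (x + 3) & ~(size_t)3; }

/*
 * Scan a note segment for the first note with the given owner name and type.
 * Returns a pointer to its descriptor and stores the length in *descsz,
 * or returns NULL if no such note exists or the segment is malformed.
 */
const void *find_note(const uint8_t *buf, size_t len, const char *name,
                      uint32_t type, uint32_t *descsz)
{
    size_t off = 0;

    assert(buf != NULL && name != NULL && descsz != NULL);
    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        memcpy(&h, buf + off, sizeof(h));

        /* Bounds-check the raw and padded sizes before using them. */
        size_t avail = len - off - sizeof(h);
        if (h.namesz > avail || h.descsz > avail)
            return NULL;
        size_t nsz = align4(h.namesz), dsz = align4(h.descsz);
        if (nsz > avail || dsz > avail - nsz)
            return NULL;

        const uint8_t *n = buf + off + sizeof(h);
        if (h.type == type && h.namesz == strlen(name) + 1 &&
            memcmp(n, name, h.namesz) == 0) {
            *descsz = h.descsz;
            return n + nsz;
        }
        off += sizeof(h) + nsz + dsz;  /* advance to the next note */
    }
    return NULL;
}
```

Since notes are independent records, a kernel image can carry one note per entry point and a loader can look up whichever one it understands; the specific note type numbers to search for would come from Xen's public headers and are deliberately not shown here.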
>
> Luis
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel