From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: "Matt Fleming" <matt@codeblueprint.co.uk>,
	"Michael Chang" <MChang@suse.com>,
	linux-kernel@vger.kernel.org,
	"Julien Grall" <julien.grall@arm.com>,
	"Jan Beulich" <JBeulich@suse.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Daniel Kiper" <daniel.kiper@oracle.com>,
	x86@kernel.org, "Vojtěch Pavlík" <vojtech@suse.cz>,
	"Gary Lin" <GLin@suse.com>,
	xen-devel@lists.xenproject.org,
	"Jeffrey Cheung" <JCheung@suse.com>,
	"Charles Arndol" <carnold@suse.com>,
	"Stefano Stabellini" <stefano.stabellini@eu.citrix.com>,
	joeyli <jlee@suse.com>, "Borislav Petkov" <bp@alien8.de>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Juergen Gross" <jgross@suse.com>,
	"Andrew Cooper" <andrew.cooper3@citrix.com>,
	"Jim Fehlig" <jfehlig@suse.com>,
	"Andy Lutomirski" <luto@amacapital.net>,
	"David Vrabel" <david.vrabel@citrix.com>,
	"Linus Torvalds" <torvalds@linux-foundation.org>
Subject: Re: HVMLite / PVHv2 - using x86 EFI boot entry
Date: Wed, 13 Apr 2016 17:08:01 -0400
Message-ID: <20160413210801.GC5962@char.us.oracle.com>
In-Reply-To: <20160413204055.GD1990@wotan.suse.de>

On Wed, Apr 13, 2016 at 10:40:55PM +0200, Luis R. Rodriguez wrote:
> On Wed, Apr 13, 2016 at 02:56:29PM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R. Rodriguez wrote:
> > > On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> > > 
> > > > What would be gained by using the same entry but having two different boot
> > > > paths after it?
> > > 
> > > It's a good question. In summary, for me it would be the push for sharing more
> > > code and the push for semantics on early boot to address differences
> > > proactively, and ultimately it may help us bring the old PV boot path
> > > closer.
> > 
> > But why? We want to kill PV (eventually).
> 
> Yeah yeah, but it's still there, and we'll have to live with it for
> at least 5 years I hear. Part of my interest is to see to it
> that this path gets less disruption and issues, and we also address
> dead code issues which pvops simply swept under the rug. The dead code
> concerns may exist still for hvmlite, so unless someone is willing
> to make a bold claim there is none, it's something to consider.

What is this dead code you speak of? Is it MTRR? Is it early path code
that PV misses (like KASLR or other)?


The entry point in Linux "proper" is startup_32 or startup_64 - the same
path that EFI uses.

If you were to draw this (very simplified):

a) GRUB2 ----------------------\ (creates a bootparam structure)
                                \
                                 +---- startup_32 or startup_64
b) EFI -> Linux EFI stub -------/
       (creates bootparam)     /
c) GRUB2-EFI  -> Linux EFI----/
               stub         /
d) HVMLite ----------------/
      (creates bootparam)

(I am not sure about c) - I would have to look in the source to
be sure.) There is also LILO in this, but I am not even sure if it
works anymore.


What you have is that every entry point creates the bootparams
and ends up calling startup_X. startup_X then hits the rest
of the kernel. The startup_X code is the one that sets up
the basic pagetables, segments, etc.
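
To make the shape of that concrete, here is a minimal user-space sketch,
not kernel code. Every name in it (bootparams_sketch, startup_x,
grub2_entry, efi_stub_entry, hvmlite_entry, start_info_sketch) is
hypothetical and heavily simplified. The only point it illustrates is
that each path fills in the same structure and then calls the same
function, which does not know or care who built the structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for the real bootparams. */
struct bootparams_sketch {
    char     cmdline[64];   /* kernel command line */
    uint64_t ram_bytes;     /* amount of usable RAM */
};

/* Stand-in for startup_32/startup_64: it only ever sees the bootparams. */
int startup_x(const struct bootparams_sketch *bp)
{
    if (bp == NULL || bp->ram_bytes == 0)
        return -1;          /* nothing sane to boot with */
    /* Real code would set up basic pagetables, segments, etc. here. */
    return 0;
}

/* Helper used by every entry path below to fill in the structure. */
static void fill_bootparams(struct bootparams_sketch *bp,
                            const char *cmdline, uint64_t ram_bytes)
{
    memset(bp, 0, sizeof(*bp));
    /* The memset above guarantees NUL termination after a truncated copy. */
    strncpy(bp->cmdline, cmdline, sizeof(bp->cmdline) - 1);
    bp->ram_bytes = ram_bytes;
}

/* a) Bootloader path: the loader hands over what it collected itself. */
int grub2_entry(const char *cmdline, uint64_t ram_bytes)
{
    struct bootparams_sketch bp;
    fill_bootparams(&bp, cmdline, ram_bytes);
    return startup_x(&bp);
}

/* b)/c) EFI stub path: firmware reports memory in 4 KiB pages (faked). */
int efi_stub_entry(const char *cmdline, uint64_t ram_pages)
{
    struct bootparams_sketch bp;
    fill_bootparams(&bp, cmdline, ram_pages * 4096u);
    return startup_x(&bp);
}

/* Hypothetical stand-in for the hypervisor's start-of-day information. */
struct start_info_sketch {
    const char *cmdline;
    uint64_t    ram_bytes;
};

/* d) HVMLite stub path: translate the hypervisor's data, then join in. */
int hvmlite_entry(const struct start_info_sketch *si)
{
    struct bootparams_sketch bp;
    if (si == NULL)
        return -1;
    fill_bootparams(&bp, si->cmdline, si->ram_bytes);
    return startup_x(&bp);
}
```

All the per-path differences live before the call to startup_x(); after
it there is a single path.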

> 
> How we address semantics then is *very* important to me.

Which semantics? How the CPU is going to be at startup_X ? Or
how the CPU is going to be when EFI firmware invokes the EFI stub?
Or when GRUB2 loads Linux?

That (for those bootloaders) is clearly defined. The URL I provided
describes the HVMLite one. Documentation/x86/boot.txt describes
what semantics are to be expected when providing a bootstrap
(which is what the HVMLite stub code in Linux would be written against -
and what the EFI stub code was written against too).
> 
> > > I'll elaborate on this but first let's clarify why a new entry is used for
> > > HVMlite to start off with:
> > > 
> > >   1) Xen ABI has historically not wanted to set up the boot params for Linux
> > >      guests, instead it insists on letting the Linux kernel Xen boot stubs fill
> > >      that out for it. This sticking point means it has implicated a boot stub.
> > 
> > 
> > Which is b/c it has to be OS agnostic. It has nothing to do with 'not wanting'.
> 
> It can still be OS agnostic and pass on type and custom data pointer.

Sure. It has that (it MUST, otherwise how else would you pass data?).
It is documented as well: http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,xen.h.html#incontents_startofday
(see "Start of day structure passed to PVH guests in %ebx.")
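
As a rough illustration of what a guest-side stub might do first with
that pointer, the sketch below only checks the leading magic field. The
magic value (0x336ec578, "xEn3" with the 0x80 bit of the "E" set) is
recalled from the Xen public headers and should be treated as an
assumption to verify against them; the remaining fields are deliberately
left out. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed magic value; verify against the Xen public headers. */
#define SKETCH_HVM_START_MAGIC 0x336ec578u

/*
 * Returns 1 if the buffer is large enough and begins with the expected
 * magic, 0 otherwise. Only the first 32-bit field is examined; the rest
 * of the structure is not modeled here.
 */
int start_info_looks_valid(const void *start_info, size_t len)
{
    uint32_t magic;

    if (start_info == NULL || len < sizeof(magic))
        return 0;
    /* memcpy avoids any alignment assumptions about the pointer. */
    memcpy(&magic, start_info, sizeof(magic));
    return magic == SKETCH_HVM_START_MAGIC;
}
```

A stub would run a check like this before trusting anything else behind
the pointer, and only then translate the contents into bootparams.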

> 
> Would that be reasonable ?
> 
> > >      The HVMLite boot entry tries to bring the boot entries paths closer as it
> > >      leverages more of the HVM boot path philosophy to mimic the regular PC boot
> > >      path.
> > > 
> > >      Is HVMLite supposed to support legacy PV guests as well BTW ?
> > 
> > Gosh no.
> 
> Interesting.. and *everyone* is happy about this?

The Xen Linux _and_ x86 maintainers are.
And the Xen community developers as well (I haven't heard anybody screaming NOOO,
so I am presuming so).

> 
> > >      Reason I'm highlighting Xen ABI as a *reason* alone is that even with
> > >      today's large discrepancy on the old PV boot path I believe we can
> > >      bring together the boot paths closer together if the Xen ABI was slightly
> > >      flexible about this, I've highlighted how I believe that is possible before,
> > 
> > <runs away screaming>
> 
> Everyone has. If you need to support old PV guests for more than 5 years the
> work I'm doing should help with that. I'm trying to leverage gains of the
> work I'm doing for HVMLite, and part of this is trying to address semantics
> proactively.

What do you mean by 'support'? Support an old kernel or support upstream Linux?

> 
> > >      *iff* the Xen ABI would at the very least set 2 things only:
> > > 
> > >      a) Hypervisor type
> > >      b) A custom data pointer
> > > 
> > >      This would enable a single boot entry on the guest to handle then:
> > > 
> > > 	Pseudo code:
> > > 
> > > 	startup_32()                         startup_64()
> > > 	       |                                  |
> > > 	       |                                  |
> > > 	       V                                  V
> > > 	pre_hypervisor_stub_32()        pre_hypervisor_stub_64()
> > > 	       |                                  |
> > > 	       |                                  |
> > > 	       V                                  V
> > > 	 [existing startup_32()]       [existing startup_64()]
> > > 	       |                                  |
> > > 	       |                                  |
> > > 	       V                                  V
> > > 	post_hypervisor_stub_32()       post_hypervisor_stub_64()
> > > 
> > >      
> > >      If the Xen ABI was flexible about setting a hypervisor type and custom
> > >      data pointer then we would have handlers for it, and in it, it can
> > >      do whatever it thinks is needed for its own guest types. It could
> > >      also continue to set the zero page on its own as it sees fit.
> > > 
> > >      Again, note that if this is done it could also mean even bringing together
> > >      the old PV boot path closer together... so this is not just a prospect
> > >      for HVMLite but also for old PV guests.
> > > 
> > >   2) Because of 1) no formal semantics for early boot code
> > >      are available, and so severe differences can best be addressed also
> > >      by yet another boot entry. This has meant often times not addressing
> > 
> > There are semantics written for this new code: http://xenbits.xen.org/docs/unstable/misc/hvmlite.html
> 
> That only addressed semantics for early boot code implicitly through a new entry...

And there is the Documentation/x86/boot.txt.

You have two semantics from either side clearly defined. Now it is just
the matter of connecting the dots.

> 
> > All other ones related to low-level operations are described in Intel SDM.
> > 
> > 
> > >      or not knowing if we've addressed real differences between the different
> > >      entries. Case in point, dead code [0]. How do we know we will not run
> > >      certain code that should not run for the different entries ? Without
> > >      *any* semantics later in boot code to distinguish where we came from
> > >      and because we strive to build single kernels with different possible
> > >      run time environments it means we have tons of code available to
> > >      execute / run that we may not need.
> > 
> > I am not following that. PVH aka HVMLite will pretty much erase the need for the
> > pvops.
> 
> It does not mean there are no dead code concerns with HVMlite.

I am pretty sure there are none. But I need to make sure I understand
what you mean by 'dead code'.

> 
> > > 
> > >      Because of the lack of semantics we may still have dead code prospects
> > >      with the new HVMLite entry. How are we sure there are no differences ?
> > > 
> > > [0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html
> > > 
> > >   3) Unikernel / other OS requirements: this is really tied to 2) but even if
> > >      we tried to evolve the Xen ABI it would mean considering existing solutions
> > >      out there. Things to consider as an example: FreeBSD doesn't have an EFI
> > >      entry, unikernels want a simple boot entry.
> > > 
> > > With this in mind then, that I can think of:
> > > 
> > > Cons of using the same entry but having two different boot paths:
> > > 
> > >   * Pushes the Xen ABI, needs to make everyone happy, this is hard
> > >   * Perhaps harder to implement
> > > 
> > > Gains of striving to use the same entry but having two different boot:
> > > 
> > >  * Helps to share more code easily
> > >  * Reduce attack surface
> > >  * Requires us to have semantics for early boot; this has a series of
> > >    side benefits:
> > >    - Means you should try to address differences explicitly rather than
> > >      implicitly -- case in point Dead Code
> > > 
> > > > You still need a way to distinguish between bare metal
> > > > EFI and HVMlite.
> > > 
> > > Great point! This is the semantics aspect. The new entry for HVMlite approach
> > > deals with this by making the differences implicit by the new entry point.
> > > My call for addressing this through a hypervisor type was to see if we can
> > > get those semantics added explicitly so we can also later address dead
> > > code concerns for the new HVMLite guest type.
> > 
> > Right, they are..
> 
> There is huge merit in addressing huge chunks of dead code concerns by sticking
> closer to the native boot paths, it doesn't mean you still have no

Right, which we do. Keep in mind that Linux does not boot by itself. It needs
a bootloader which sets the stage for it. We set the same exact stage.

> dead code concerns with HVMlite, nor that HVMLite has no platform quirks,
> it does and part of some recent work is to pave a *clean* path for setting
> these differences apart.

/me scratches his head.

There will always be platform quirks.

I guess I am not understanding your concerns. The work that Boris is doing is
to code against the bootparams - which has a spec.

> 
> > > Part of my own interest in an EFI entry here is that EFI could be used to help
> > > expand on the semantics in an OS/agnostic form rather than pushing the x86 boot
> > > protocol further. That seems to have its own set of drawbacks though.
> > > 
> > > 
> > > > And Xen needs a way to find out whether a kernel is
> > > > supporting HVMlite to boot it in the correct mode.
> > > 
> > > How was Xen going to find out if new kernels had HVMlite support with the
> > > new entry ? An ELFNOTE() ? If an entry is shared could we not use an
> > 
> > Yeah.
> > > ELFNOTE() also for this though too ?
> > 
> > Not sure what you mean by 'shared'. But you can add multiple Elf PT_NOTEs.
> > See the ELF document.
> 
> OK so even if we used a common/shared entry point we can address letting
> Xen find out whether or not a kernel supports HVMlite.

Yes. Xen parses the Linux ELF NOTEs and can figure out if the kernel
can do HVMLite or not.
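
For reference, walking ELF notes is simple. The sketch below scans the
contents of one PT_NOTE segment for a note named "Xen" with a given
type. The note layout (namesz, descsz, type, then name and descriptor
each padded to 4 bytes) is the generic ELF one. The function name is
hypothetical, and the type value 18 for the 32-bit physical entry note
is an assumption taken from memory of Xen's elfnote.h, so check it there.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic ELF note header: three 32-bit words precede name and desc. */
struct note_hdr {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;
};

/* Assumed value of XEN_ELFNOTE_PHYS32_ENTRY; check Xen's elfnote.h. */
#define SKETCH_XEN_ELFNOTE_PHYS32_ENTRY 18u

/* Round up to the 4-byte padding used inside note segments. */
static size_t align4(size_t x)
{
    return (x + 3) & ~(size_t)3;
}

/*
 * Return a pointer to the descriptor of the first note named "Xen" with
 * the requested type, or NULL if there is none or the buffer is
 * malformed. On success *descsz_out (if non-NULL) gets the unpadded
 * descriptor size.
 */
const void *find_xen_note(const uint8_t *buf, size_t len,
                          uint32_t type, uint32_t *descsz_out)
{
    size_t off = 0;

    while (len - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t name_len, desc_len;
        const uint8_t *name, *desc;

        memcpy(&h, buf + off, sizeof(h));
        off += sizeof(h);

        /* Reject sizes that cannot fit before padding them. */
        if (h.namesz > len - off)
            return NULL;
        name_len = align4(h.namesz);
        if (name_len > len - off || h.descsz > len - off - name_len)
            return NULL;
        desc_len = align4(h.descsz);
        if (desc_len > len - off - name_len)
            return NULL;

        name = buf + off;
        desc = name + name_len;

        /* namesz counts the trailing NUL, so "Xen" has namesz 4. */
        if (h.type == type && h.namesz == 4 &&
            memcmp(name, "Xen", 4) == 0) {
            if (descsz_out)
                *descsz_out = h.descsz;
            return desc;
        }
        off += name_len + desc_len;
    }
    return NULL;
}
```

If the note is present the toolstack knows the kernel offers that entry
and can boot it in that mode; if not, it falls back to whatever else the
kernel advertises.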

> 
>   Luis
