From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752855AbcDMWXa (ORCPT ); Wed, 13 Apr 2016 18:23:30 -0400
Received: from mx2.suse.de ([195.135.220.15]:50828 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750990AbcDMWX3 (ORCPT ); Wed, 13 Apr 2016 18:23:29 -0400
Date: Thu, 14 Apr 2016 00:23:17 +0200
From: "Luis R. Rodriguez"
To: Konrad Rzeszutek Wilk
Cc: "Luis R. Rodriguez", Juergen Gross, Matt Fleming, Michael Chang,
	linux-kernel@vger.kernel.org, Jim Fehlig, Jan Beulich,
	"H. Peter Anvin", Daniel Kiper, x86@kernel.org, Vojtěch Pavlík,
	Gary Lin, xen-devel@lists.xenproject.org, Jeffrey Cheung,
	Stefano Stabellini, joeyli, Borislav Petkov, Boris Ostrovsky,
	Charles Arndol, Andrew Cooper, Julien Grall, Andy Lutomirski,
	David Vrabel, Linus Torvalds, Roger Pau Monné
Subject: Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry
Message-ID: <20160413222317.GH1990@wotan.suse.de>
References: <20160406024027.GX1990@wotan.suse.de>
	<5704D978.1050101@citrix.com>
	<20160408204032.GR1990@wotan.suse.de>
	<570B3228.90400@suse.com>
	<20160413182951.GW1990@wotan.suse.de>
	<20160413185629.GA7501@char.us.oracle.com>
	<20160413204055.GD1990@wotan.suse.de>
	<20160413210801.GC5962@char.us.oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160413210801.GC5962@char.us.oracle.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 13, 2016 at 05:08:01PM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Apr 13, 2016 at 10:40:55PM +0200, Luis R. Rodriguez wrote:
> > On Wed, Apr 13, 2016 at 02:56:29PM -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Apr 13, 2016 at 08:29:51PM +0200, Luis R.
> > > Rodriguez wrote:
> > > > On Mon, Apr 11, 2016 at 07:12:08AM +0200, Juergen Gross wrote:
> > > >
> > > > > What would be gained by using the same entry but having two different boot
> > > > > paths after it?
> > > >
> > > > It's a good question. In summary, for me it would be the push for sharing more
> > > > code and the push for semantics on early boot to address differences
> > > > proactively, and ultimately it may enable us to help bring the old PV
> > > > boot path closer.
> > >
> > > But why? We want to kill PV (eventually).
> >
> > Yeah yeah, but it's still there, and we'll have to live with it for
> > at least 5 years, I hear. Part of my interest is to see to it
> > that this path gets less disruption and issues, and we also address
> > dead code issues which pvops simply swept under the rug. The dead code
> > concerns may still exist for hvmlite, so unless someone is willing
> > to make a bold claim there is none, it's something to consider.
>
> What is this dead code you speak of? Is it MTRR? Is it early path code
> that PV misses (like KASLR or other?)

KASAN is dead code to Xen. If you boot x86 Xen with KASAN enabled, Xen
explodes. Quick question: will KASAN not explode with HVMLite?

MTRR used to be a dead code concern, but since we have vetted most of
that code we are now pretty certain it should never run.

KASLR may be -- I'm not sure as I haven't vetted that, but from what I
have loosely heard, maybe.

VGA code will be dead code for HVMLite for sure, as the design doc says
it will not run VGA. The ACPI flag will be set, but the check for that
is not yet in Linux. That means the Linux VGA code will be there, but we
have no way to ensure it will not run, or that nothing will muck with it.

To be clear -- dead code concerns still exist even without
virtualization solutions; it's just that with virtualization this stuff
comes up more, and there have been no proactive measures to address it.
The question of semantics here is to see to what extent we need earlier
boot code annotations to ensure we address semantics proactively.

> The entry point in Linux "proper" is startup_32 or startup_64 - the same
> path that EFI uses.
>
> If you were to draw this (very simplified):
>
> a)- GRUB2 ---------------------\ (creates a bootparam structure)
>                                 \
>                                  +---- startup_32 or startup_64
> b) EFI -> Linux EFI stub -------/
>        (creates bootparam)     /
> c) GRUB2-EFI -> Linux EFI----/
>        stub                 /
> d) HVMLite ----------------/
>        (creates bootparam)

b) and d) might be able to share paths there... d) still has its own
entry; it does more than create boot params.

> (I am not sure about c) - I would have to look in the source to
> be sure). There is also LILO in this, but I am not even sure if
> it works anymore.
>
>
> What you have is that every entry point creates the bootparams
> and ends up calling startup_X. The startup_64 then hits the rest
> of the kernel. The startup_X code is the one that would set up
> the basic pagetables, segments, etc.

Sure.. a full diagram should include both sides, and show how using a
custom entry runs the risk of skipping a lot of setup code. There is
that, and, as others have pointed out, certain guest types are assumed
not to have certain peripherals, and we have no way to ensure that
certain old legacy code will never run or be accessed by drivers.

> > How we address semantics then is *very* important to me.
>
> Which semantics? How the CPU is going to be at startup_X? Or
> how the CPU is going to be when EFI firmware invokes the EFI stub?
> Or when GRUB2 loads Linux?

What hypervisor kicked me and what guest type I am.

Let me elaborate more below.

> That (those bootloaders) is clearly defined. The URL I provided
> mentions the HVMLite one.
> The Documentation/x86/boot.txt mentions
> what the semantics are to be expected when providing a bootstrap
> (which is what HVMLite stub code in Linux would write against -
> and what EFI stub code had been written against too).
> >
> > > > I'll elaborate on this but first let's clarify why a new entry is used for
> > > > HVMlite to start off with:
> > > >
> > > > 1) Xen ABI has historically not wanted to set up the boot params for Linux
> > > > guests, instead it insists on letting the Linux kernel Xen boot stubs fill
> > > > that out for it. This sticking point means it has implied a boot stub.
> > >
> > >
> > > Which is b/c it has to be OS agnostic. It has nothing to do with 'not wanting'.
> >
> > It can still be OS agnostic and pass on a type and a custom data pointer.
>
> Sure. It has that (it MUST, otherwise how else would you pass data).
> It is documented as well http://xenbits.xen.org/docs/unstable/hypercall/x86_64/include,public,xen.h.html#incontents_startofday
> (see " Start of day structure passed to PVH guests in %ebx.")

The design doc begs for a custom OS entry point though. If we had
a single 'type' and 'custom data' passed to the kernel, that should
suffice for the default Linux entry point to just pivot off of that
and do what it needs without more entry points. Once.

  Luis
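
To make the 'type' plus 'custom data' idea above a bit more concrete,
here is a rough sketch in plain C of what pivoting once at a shared
entry could look like. Every name in it (boot_hyp_type, boot_hyp_info,
boot_path, select_boot_path) is made up for illustration; none of them
exist in Linux or Xen, and this is not how the HVMLite start info is
actually laid out.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none exist in Linux or Xen. */
enum boot_hyp_type {
	BOOT_HYP_NONE = 0,	/* bare metal, or nothing announced itself */
	BOOT_HYP_XEN_PV,
	BOOT_HYP_XEN_HVMLITE,
};

/* What a loader or hypervisor would hand to the common entry point. */
struct boot_hyp_info {
	uint32_t type;		/* one of enum boot_hyp_type */
	uint64_t custom_data;	/* opaque address; meaning depends on type */
};

enum boot_path {
	BOOT_PATH_NATIVE,
	BOOT_PATH_XEN_PV,
	BOOT_PATH_XEN_HVMLITE,
};

/*
 * Pivot once on the type at the shared entry instead of adding one
 * entry point per guest type. custom_data is left for the selected
 * path to interpret.
 */
enum boot_path select_boot_path(const struct boot_hyp_info *info)
{
	if (info == NULL)
		return BOOT_PATH_NATIVE;

	switch (info->type) {
	case BOOT_HYP_XEN_PV:
		return BOOT_PATH_XEN_PV;
	case BOOT_HYP_XEN_HVMLITE:
		return BOOT_PATH_XEN_HVMLITE;
	default:
		/* Unknown types are treated as native in this sketch. */
		return BOOT_PATH_NATIVE;
	}
}
```

The point of the sketch is only that the decision is made in one
place; how unknown types should really be handled is an open question
here, not something the design doc settles.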