From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 14 Apr 2016 22:14:22 -0400
From: Konrad Rzeszutek Wilk
To: "Luis R. Rodriguez"
Cc: George Dunlap, Matt Fleming, jeffm@suse.com,
 Linux Kernel Mailing List, Jim Fehlig, Jan Beulich, "H. Peter Anvin",
 Daniel Kiper, the arch/x86 maintainers, Takashi Iwai, Vojtěch Pavlík,
 Gary Lin, xen-devel, Jeffrey Cheung, Charles Arndol, Julien Grall,
 Stefano Stabellini, joeyli, Borislav Petkov, Boris Ostrovsky,
 Juergen Gross, Andrew Cooper, Michael Chang, Andy Lutomirski,
 David Vrabel, Linus Torvalds, Roger Pau Monné
Subject: Re: [Xen-devel] HVMLite / PVHv2 - using x86 EFI boot entry
Message-ID: <20160415021422.GB6956@localhost.localdomain>
References: <20160406024027.GX1990@wotan.suse.de>
 <20160407185148.GL1990@wotan.suse.de>
 <20160413195257.GB1990@wotan.suse.de>
 <570F68AB.2040400@citrix.com>
 <20160414194408.GP1990@wotan.suse.de>
 <20160414203847.GB21657@localhost.localdomain>
 <20160414211201.GS1990@wotan.suse.de>
In-Reply-To: <20160414211201.GS1990@wotan.suse.de>

On Thu, Apr 14, 2016 at 11:12:01PM +0200, Luis R.
Rodriguez wrote:
> On Thu, Apr 14, 2016 at 04:38:47PM -0400, Konrad Rzeszutek Wilk wrote:
> > > This has nothing to do with dominance or anything nefarious, I'm asking
> > > simply for a full engineering evaluation of all possibilities, with
> > > the long term in mind. Not for now, but for hardware assumptions which
> > > are sensible 5 years from now.
> >
> > There are two different things in my mind about this conversation:
> >
> >  1). semantics of low-level code wrapped around pvops. On baremetal
> >    it is easy - just look at Intel and AMD SDM.
> >    And this is exactly what running in HVM or HVMLite mode will do -
> >    all those low-level operations will have the same exact semantic
> >    as baremetal.
>
> Today Linux is KVM stupid for early boot code. I've pointed this out

-EPARSE?
> before, but again, there has been no reason found to need this. Perhaps
> for HVMLite we won't need this...

Are you talking about kvmtools? Which BTW are similar to how HVMLite
would expose the platform.
>
> >    There is no hope for the pv_ops to fix that.
>
> Actually I beg to differ. See my patches and ongoing work.

I meant in terms of semantics. As in, I cannot see some of
those pv-ops having the same semantics as baremetal. For example,
set_pte is simple on x86 (movq $<some value>, <memory address>).

While on Xen PV it is a potential batching hypercall with a
lookup in a P2M table, then perhaps a sidelong look at
the M2P, then maybe the M2P override.

>
> >    And I am pretty sure the HVMLite in 5 years will have no
> >    trouble in this as it will be running in VMX mode (HVM).
>
> HVMLite may still use PV drivers for some things, its not super
> obvious to me that low level semantics will not be needed yet.

PV drivers are very different from low-level semantics.

And it will have to use them.

Maybe it is easier to think of this in terms of kvmtool - it
is pretty much how this would work - but instead of VirtIO
drivers you would be using the Xen PV drivers (though one
could also use VirtIO ones if you wanted).
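
To make the set_pte contrast above concrete, here is a toy model in
Python. It is only a sketch: NativeOps, XenPvOps, flush() and the
dictionary-based page table are hypothetical names invented for
illustration, not kernel or Xen interfaces, and the M2P lookup and M2P
override steps are left out. The only point is that one call name can
hide a single store on one side and a translate-queue-flush sequence on
the other.

```python
class NativeOps:
    """Bare metal (toy): set_pte is one store into the page table."""

    def __init__(self):
        self.page_table = {}

    def set_pte(self, vaddr, pfn):
        # Stands in for the single movq on real hardware.
        self.page_table[vaddr] = pfn

    def flush(self):
        # Nothing is ever queued on bare metal.
        pass


class XenPvOps:
    """Xen PV (toy): translate through a P2M table, queue, then batch."""

    def __init__(self, p2m):
        self.p2m = p2m          # hypothetical map: guest pfn -> machine frame
        self.pending = []       # updates waiting for the next batch
        self.page_table = {}
        self.hypercalls = 0     # counts simulated batched hypercalls

    def set_pte(self, vaddr, pfn):
        # The caller passes a guest frame; the backend must translate it.
        mfn = self.p2m[pfn]
        self.pending.append((vaddr, mfn))

    def flush(self):
        # One simulated hypercall applies every queued update.
        if not self.pending:
            return
        self.hypercalls += 1
        for vaddr, mfn in self.pending:
            self.page_table[vaddr] = mfn
        self.pending.clear()
```

In this model a caller cannot assume the mapping is visible right after
set_pte() returns on the PV side, which is the kind of semantic gap
meant above.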
>
> >  2). Boot entry.
> >
> >    The semantics on Linux are well known - they are documented in
> >    Documentation/x86/boot.txt.
> >
> >    HVMLite Linux guests have to somehow provide that.
> >
> >    And how it is done seems to be tied around:
> >
> >    a) Use existing boot paths - which means making some
> >       extra stub code to call in those existing boot paths
> >       (for example Xen could bundle with an GRUB2-alike
> >        code to be run when booting Linux using that boot-path).
> >
> >       Or EFI (for a ton more code). Granted not all OSes
> >       support those, so not very OS agnostic.
>
> What other OSes do is something to consider but if they don't
> do it because they are slacking in one domain should by no means
> be a reason to not evaluate the long term possible gains.
> Specially if we have reasons to believe more architectures will
> consider it and standardize on it.
>
> It'd be silly not to take this a bit more seriously.

Complexity vs simplicity.
>
> >        Hard part - if the bootparams change then have to
> >       rev up the code in there. May be out of sync
> >       with Linux bootparams.
>
> If we are going to ultimately standardize on EFI boot for new
> hardware it'd be rather silly to extend the boot params further.

Whoa there... Have you spoken to hpa, tglx about this?

>
> >    b) Add another simpler boot entry point which has to copy
> >      "some" strings from its format in bootparams.
> >
> >
> >    So this part of the discussion does not fall in the
> >    hardware assumptions. Intel SDM or AMD mention nothing about
> >    boot loaders or how to boot an OS - that is all in realms
> >    of how software talks to software.
>
> Right -- so one question to ask here is what other uses are there
> for this outside of say HVMLite. You mentioned Multiboot so far.
>
> >  3). And there is the discussion on man-power to make this
> >    happen.
>
> Sure.
>
> >  4). Lastly which one is simpler and involves less code so
> >     that there is a less chance of bitrot.
>
> Indeed.
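
As a rough illustration of option 2(b) above, the "simpler entry point"
boils down to copying a handful of fields from whatever the hypervisor
hands over into the structure the rest of the kernel expects. The
Python below is a sketch under stated assumptions: StartInfo,
BootParams, simple_entry() and all of their field names are
hypothetical and do not reproduce the real layouts from
Documentation/x86/boot.txt or any Xen ABI.

```python
from dataclasses import dataclass, field


@dataclass
class StartInfo:
    """Hypothetical structure handed over by the hypervisor."""
    cmdline: str
    ramdisk_addr: int
    ramdisk_size: int
    memory_map: list  # list of (start, size, kind) tuples


@dataclass
class BootParams:
    """Hypothetical stand-in for what the kernel proper consumes."""
    cmd_line: str = ""
    ramdisk_image: int = 0
    ramdisk_size: int = 0
    memory_map: list = field(default_factory=list)


def simple_entry(info: StartInfo) -> BootParams:
    """Translate the hypervisor's format into the kernel's own format."""
    params = BootParams()
    params.cmd_line = info.cmdline
    params.ramdisk_image = info.ramdisk_addr
    params.ramdisk_size = info.ramdisk_size
    # Copy rather than alias, so later edits do not touch the original.
    params.memory_map = list(info.memory_map)
    return params
```

The maintenance cost mentioned under 2(a) shows up here too, only
smaller: every new field on either side means touching simple_entry().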
>
> You also forgot the tie-in between dead-code and semantics but

Wait, I just spoke about CPU semantics?! Which semantics
are you talking about?
> that clearly is not on your mind. But I'd say this is a good
> summary.

I put 'dead code' in the same realm as device driver work.
And they seem to always have some issue or another.
Or maybe I am just getting unlucky and getting copied on those bugs.
>
>   Luis