From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v2 02/11] xen/hvmlite: Bootstrap HVMlite guest Date: Wed, 3 Feb 2016 15:11:56 -0500 Message-ID: <56B25F0C.2050808__28191.0553960644$1454530411$gmane$org@oracle.com> References: <1454341137-14110-1-git-send-email-boris.ostrovsky@oracle.com> <1454341137-14110-3-git-send-email-boris.ostrovsky@oracle.com> <20160203185525.GV20964@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aR3mT-0003Ck-AX for xen-devel@lists.xenproject.org; Wed, 03 Feb 2016 20:12:01 +0000 In-Reply-To: <20160203185525.GV20964@wotan.suse.de> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Luis R. Rodriguez" Cc: pmonclus@plumgrid.com, GLin@suse.coma, mcgrof@suse.com, x86@kernel.org, linux-kernel@vger.kernel.org, David Vrabel , hpa@zytor.com, xen-devel@lists.xenproject.org, bp@suse.de, bblanco@plumgrid.com, roger.pau@citrix.com List-Id: xen-devel@lists.xenproject.org On 02/03/2016 01:55 PM, Luis R. Rodriguez wrote: > I saw no considerations for the recommendations I had made last on your v1: > > https://lkml.kernel.org/r/CAB=NE6XPA0YzbnM8=rspkKai6d3GkXXO00Gr0VZUYoyzNy6thw@mail.gmail.com > > Of importance: > > 1) Using pv_info.paravirt_enabled = 1 is wrong unless you mean to say this > is for legacy x86: > > Your patch #3 keeps on setting pv_info.paravirt_enabled = 1 and as discussed > this is wrong. It will be renamed to x86_legacy_free() to align with what folks > are pushing for a BIOS flag to annotate if a system requires legacy x86 stuff. > This also means re-thinking all use cases and ensuring subarch is used then > instead when the goal was to avoid Xen from entering that code. Today Xen does > not use this but with my work it does and it helps clean and brush up a lot of > these checks with future prospects to even help unify entry points. As I said earlier, I am not sure I understand what subarch buys us for HVMlite guests. As for using paravirt_enabled -- this is really only used to differentiate HVM from HVMlite and I think (although I'd need to check) is only needed by Xen-specific code in a couple of places. So if/when it is removed we will switch to something else. Since your work is WIP I decided to keep using it until it's clear what other options may be available. > > 2) We should avoid more hypervisor type hacks, and just consider a new > hypervisor type to close the gap: > > Using x86_legacy_free() and friends in a unified way for all systems means it > should only be used after init_hypervisor_platform() which is called during > setup_arch(). This means we have a semantic gap for checks on "are we on > hypervisor type and which one?". In this particular case we don't need any information about hypervisor until init_hypervisor_platform(). > There are drivers now using these sorts of > checks as well, for instance snd_intel8x0_inside_vm(). We should avoid having > these hacks but that also means cleaning up a well define grammar here for what > we want. I'm doing work to help with this by streamlining use of the subarch > type, that should help with PV code, but your use case seems different but yet > related, what I had suggested last was to consider we add a new hypervisor type > to the x86 boot protocol which would be available early on. This would have a > few purposes, one of which deserves its own section below on dead code: > > a) clean up hacks as with snd_intel8x0_inside_vm() > b) enable a generic way and clean way to distinguish what hypervisor > type you're on > c) since it would be set early and if we can ensure its accessible > early on boot it would mean avoiding having to add yet-another > asm entry point for Linux, you could just use startup_32() and > the hypervisor type could easily just have an early branch call > and post branch call very similar to how we deal with the subarch > currently on 32-bit. Your calls then just become early stubs and > we'd have a solution for other PV types that want a similar solution > later Which calls? If you are referring to xen_prepare_hvmlite/hvmlite_bootparams then these are needed to prepare boot_params. And we should not enter startup_32() without them ready. > > 3) Dead code concerns and unifying entry points: > > Addressing the semantics for the gray areas I am highlighting are critical for > ensuring one does not run code or even exposes code as a available for the type > of run time system booted, some folks call this "dead code". This is critical > for Linux distributions which need to rely on the flexibility of having one > kernel work for different use cases. The resolution to this problem was pvops > but pvops has shortcoming for dead code, it didn't address the problem likely as > it was not considered serious. It also didn't address the issue of different > hypervisors wanting different entry points and that this fact alone also contributes > to more dead code concerns, case in point the regressions introduced by cr4 shadow > and the latest one is Kasan which to this day breaks Xen! Dead code topics are > not easy to grasp, its why I've started on my own crusade to talk to people and > write about it [0], and as of late propose some changes to avoid these in a > clean way without extending pvops. Adding yet another entry point will not help > here *specially* if we do not take semantics seriously over the different hypervisors > and hypervisor types. > > [0] http://www.do-not-panic.com/2015/12/avoiding-dead-code-pvops-not-silver-bullet.html > [1] http://www.do-not-panic.com/2015/12/xen-and-x86-linux-zero-page.html I don't understand what this has to do with HVMlite guests. > > My recommendation then: > > We add new hypervisor type to close the semantic gap for hypervisor types, and > much like subarch enable also a subarch_data to let you pass and use your > hvmlite_start_info. This would not only help with the semantics but also help > avoid yet-another-entry point and force us to provide a well define structure > for considering code that should not run by pegging it as required or supported > for different early x86 code stubs. As I said before, I don't see how we can avoid having another entry point without making Xen change its load procedure. Which is highly unlikely to happen. > > I had hinted perhaps we might be able to piggy back on top of the ELF loader > protocol as well, and since that's standard do wonder if that could instead > be extended to help unify a mechanism for different OSes instead of making > this just a solution for Linux. > > Code review below. > > On Mon, Feb 01, 2016 at 10:38:48AM -0500, Boris Ostrovsky wrote: >> Start HVMlite guest at XEN_ELFNOTE_PHYS32_ENTRY address. Setup hypercall >> page, initialize boot_params, enable early page tables. >> >> Since this stub is executed before kernel entry point we cannot use >> variables in .bss which is cleared by kernel. We explicitly place >> variables that are initialized here into .data. >> >> Signed-off-by: Boris Ostrovsky >> --- >> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c >> index 5774800..5f05fa2 100644 >> --- a/arch/x86/xen/enlighten.c >> +++ b/arch/x86/xen/enlighten.c >> @@ -171,6 +172,17 @@ struct tls_descs { >> */ >> static DEFINE_PER_CPU(struct tls_descs, shadow_tls_desc); >> >> +#ifdef CONFIG_XEN_PVHVM >> +/* >> + * HVMlite variables. These need to live in data segment since they are >> + * initialized before startup_{32|64}, which clear .bss, are invoked. >> + */ >> +int xen_hvmlite __attribute__((section(".data"))) = 0; >> +struct hvm_start_info hvmlite_start_info __attribute__((section(".data"))); >> +uint hvmlite_start_info_sz = sizeof(hvmlite_start_info); >> +struct boot_params xen_hvmlite_boot_params __attribute__((section(".data"))); >> +#endif >> + > The section annotations seems very special use case but likely worth documenting > and defining a new macro for in include/linux/compiler.h. This would make it > easier to change should we want to change the section used here later and > enable others to easily look for the reason for these annotations in a > single place. I wonder whether __initdata would be a good attribute. We only need this early in the boot. And xen_hvmlite is gone now so we don't need to worry about it. -boris