From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?windows-1252?Q?Roger_Pau_Monn=E9?= Subject: Re: [PATCH v3 07/32] xen/x86: fix arch_set_info_guest for HVM guests Date: Mon, 3 Aug 2015 19:31:21 +0200 Message-ID: <55BFA569.7050102@citrix.com> References: <1435923310-9019-1-git-send-email-roger.pau@citrix.com> <1435923310-9019-8-git-send-email-roger.pau@citrix.com> <55A3E0F102000078000903B5@mail.emea.novell.com> <55B0C100.6000308@citrix.com> <55B0EC520200007800094842@prv-mh.provo.novell.com> <1437651718.19412.92.camel@citrix.com> <55B103DD.4030901@citrix.com> <55B125440200007800094BD2@prv-mh.provo.novell.com> <55B10CDB.5010708@citrix.com> <1437667259.24746.12.camel@citrix.com> <55B11316.2000201@citrix.com> <55B130420200007800094CB8@prv-mh.provo.novell.com> <55B11CBE.8050704@citrix.com> <55B1208C.9090207@citrix.com> <55B211D70200007800094FCF@prv-mh.provo.novell.com> <55B20C9D.5070708@citrix.com> <55B233890200007800095124@prv-mh.provo.novell.com> <55B22B5A.8020304@citrix.com> <55B24F4A02000078000952CC@prv-mh.provo.novell.com> <55B25941.9@citrix.com> <55B27AAB02000078000954A0@prv-mh.provo.novell.com> <55B26DB1.4020302@citrix.com> <55BF9CE6.1030904@citrix.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZMJaE-0001rb-BF for xen-devel@lists.xenproject.org; Mon, 03 Aug 2015 17:31:30 +0000 In-Reply-To: <55BF9CE6.1030904@citrix.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Andrew Cooper , Jan Beulich Cc: xen-devel@lists.xenproject.org, Boris Ostrovsky , David Vrabel , Ian Campbell List-Id: xen-devel@lists.xenproject.org El 03/08/15 a les 18.55, Andrew Cooper ha escrit: > On 24/07/15 17:54, Roger Pau Monn=E9 wrote: >> >>>> struct vcpu_hvm_context { >>>> /* 16bit fields of the strucutre will be used. */ >>>> #define _VCPUHVM_MODE_16B 0 >>>> #define VCPUHVM_MODE_16B (1<<_VCPUHVM_MODE_16B) >>>> /* 32bit fields of the structure will be used. */ >>>> #define _VCPUHVM_MODE_32B 1 >>>> #define VCPUHVM_MODE_32B (1<<_VCPUHVM_MODE_32B) >>>> /* 64bit fields of the structure will be used. */ >>>> #define _VCPUHVM_MODE_64B 2 >>>> #define VCPUHVM_MODE_64B (1<<_VCPUHVM_MODE_64B) >>> As soon as we have three values here, a boolean flag approach >>> doesn't make sense anymore. >> Yes, taking into account that only _one_ of them can be used at the same >> time. >> >> I have the feeling that we are over engineering this interface. IMHO we >> should only allow the user to set the control registers, efer (on amd64) >> and the GP registers. Segment selectors would be set by Xen to point to >> a flat segment suitable for the mode the guest has selected. I think >> that should be enough to get the guest OS into it's entry point, and >> then it can do whatever it wants. >> >> If that's not suitable, then my second option would be to allow the >> guest to set the base, limit and AR bytes of the selectors, but not load >> a GDT. > = > In an ideal world, it would be nice to pass enough state to be able to > point eip at the idle loop and have everything JustWork. However, this > makes a lot of assumptions about exactly which state the vcpu wants to > set up, which might include MSRs as well. > > Therefore, an appropriate compromise is to be able to point eip at > startup_cpu, running in the correct operating mode to avoid bouncing > through trampolines. IMHO, allowing OSes to jump straight into the idle loop is putting a = lot of burden in the Xen side of things. Also all OSes already contain = AP startup code, we should be able to hook into this code at a suitable = location, and that's what we should aim for. >>>From that point of view, the minimum quantity of state required is: > = > CR{0,3,4}, EFER, cs/eip, ds, ss/esp, gdt{base,limit}. It would be > natural to extend this to all the 8 base GPRs and the user segment > selectors. (Note that EFER is needed even for 32bit paged entries if NX > is set in the pagetables.) Extending this to the rest of the GPR is needed for Linux to boot. The = AP startup code of PVH Linux places the cpu id in a GPR for the boot = stub to find it: http://lxr.free-electrons.com/source/arch/x86/xen/smp.c#L420 > = > In addition, a hint to Xen as to the eventual mode so it can widen its > reads to start with rather than attempting to reverse engineer the > operating mode and rereading half the structure later. = > = > Specifying the gdt{base,limit} will require Xen to read the GDT to set > up the entry attributes for the segment, as simply setting the selector > is insufficient. A complicating factor is that VT-x and SVM have > different representations for access rights. (For migration, VT-x's > representation is mutated to match SVM.) > = > I really undecided between suggesting a gdt{base,limit} and Xen reading > the access rights in an unambiguous fashon, vs specifying the internals > and avoiding any peeking into domain memory. > = > It would be nice if the new cpus gs base could be set by the caller, > which allows the new cpu straight access to its per-cpu data area (if gs > is in use). This is not representable using the gdt method and would > involve casing the GSBASE MSR if we were to go down the strict > architectural route. IMHO, this is out of the scope and I would rather allow the caller to = set the internal (cached) part of the segment selectors. > Therefore, I am leaning slightly towards the specify the internals side > of things, which removes some complexity from the Xen side of the hyperca= ll. I've updated the proposed structure, so now it looks like: (I still need to figure out how to order the bits in the access rights = (_ar) fields.) struct vcpu_hvm_x86_16 { uint16_t ax; uint16_t cx; uint16_t dx; uint16_t bx; uint16_t sp; uint16_t bp; uint16_t si; uint16_t di; uint16_t ip; uint32_t cr[8]; uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; }; struct vcpu_hvm_x86_32 { uint32_t eax; uint32_t ecx; uint32_t edx; uint32_t ebx; uint32_t esp; uint32_t ebp; uint32_t esi; uint32_t edi; uint32_t eip; uint32_t cr[8]; uint64_t efer; uint32_t cs_base; uint32_t ds_base; uint32_t ss_base; uint32_t cs_limit; uint32_t ds_limit; uint32_t ss_limit; uint16_t cs_ar; uint16_t ds_ar; uint16_t ss_ar; }; struct vcpu_hvm_x86_64 { uint64_t rax; uint64_t rcx; uint64_t rdx; uint64_t rbx; uint64_t rsp; uint64_t rbp; uint64_t rsi; uint64_t rdi; uint64_t r8; uint64_t r9; uint64_t r10; uint64_t r11; uint64_t r12; uint64_t r13; uint64_t r14; uint64_t r15; uint64_t rip; uint64_t cr[8]; uint64_t efer; }; typedef enum { VCPU_HVM_MODE_16B, /* 16bit fields of the structure will be used.= */ VCPU_HVM_MODE_32B, /* 32bit fields of the structure will be used.= */ VCPU_HVM_MODE_64B, /* 64bit fields of the structure will be used.= */ } vcpu_hvm_mode_t; struct vcpu_hvm_context { vcpu_hvm_mode_t mode; /* CPU registers. */ union { struct vcpu_hvm_x86_16 x86_16; struct vcpu_hvm_x86_32 x86_32; struct vcpu_hvm_x86_64 x86_64; }; }; typedef struct vcpu_hvm_context vcpu_hvm_context_t; DEFINE_XEN_GUEST_HANDLE(vcpu_hvm_context_t);