From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marcelo Tosatti Subject: Re: [PATCH 2/2] KVM: fix missing "illegal instruction"-trap in protected modes Date: Thu, 12 Jan 2012 08:47:56 -0200 Message-ID: <20120112104756.GB31635@amt.cnet> References: <4F0C0EBB.3090506@tu-ilmenau.de> <4F0C1369.9070607@redhat.com> <4F0C2C4E.3000703@tu-ilmenau.de> <4F0C3044.7050307@redhat.com> <4F0C33A0.6080509@tu-ilmenau.de> <4F0C4AA9.6000203@tu-ilmenau.de> <20120111190927.GA13298@amt.cnet> <4F0DEA86.90503@tu-ilmenau.de> <20120111212150.GA18948@amt.cnet> <4F0E0B06.2090708@tu-ilmenau.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: kvm@vger.kernel.org To: Stephan =?iso-8859-1?Q?B=E4rwolf?= Return-path: Received: from mx1.redhat.com ([209.132.183.28]:1345 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752340Ab2ALKtH (ORCPT ); Thu, 12 Jan 2012 05:49:07 -0500 Content-Disposition: inline In-Reply-To: <4F0E0B06.2090708@tu-ilmenau.de> Sender: kvm-owner@vger.kernel.org List-ID: On Wed, Jan 11, 2012 at 11:19:50PM +0100, Stephan B=E4rwolf wrote: > On 01/11/12 22:21, Marcelo Tosatti wrote: > > On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan B=E4rwolf wrote: > >> On 01/11/12 20:09, Marcelo Tosatti wrote: > >>> On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan B=E4rwolf wrote= : > >>>> >From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:= 00 2001 > >>>> From: Stephan Baerwolf > >>>> Date: Sun, 8 Jan 2012 02:03:47 +0000 > >>>> Subject: [PATCH 2/2] KVM: fix missing "illegal instruction"-trap= in > >>>> protected modes > >>>> > >>>> On hosts without this patch, 32bit guests will crash > >>>> (and 64bit guests may behave in a wrong way) for > >>>> example by simply executing following nasm-demo-application: > >>>> > >>>> [bits 32] > >>>> global _start > >>>> SECTION .text > >>>> _start: syscall > >>>> > >>>> (I tested it with winxp and linux - both always crashed) > >>>> > >>>> Disassembly of section .text: > >>>> > >>>> 00000000 <_start>: > >>>> 0: 0f 05 syscall > >>>> > >>>> The reason seems a missing "invalid opcode"-trap (int6) for the > >>>> syscall opcode "0f05", which is not available on Intel CPUs > >>>> within non-longmodes, as also on some AMD CPUs within legacy-mod= e. > >>>> (depending on CPU vendor, MSR_EFER and cpuid) > >>>> > >>>> Because previous mentioned OSs may not engage corresponding > >>>> syscall target-registers (STAR, LSTAR, CSTAR), they remain > >>>> NULL and (non trapping) syscalls are leading to multiple > >>>> faults and finally crashs. > >>>> > >>>> Depending on the architecture (AMD or Intel) pretended by > >>>> guests, various checks according to vendor's documentation > >>>> are implemented to overcome the current issue and behave > >>>> like the CPUs physical counterparts. > >>>> > >>>> (Therefore using Intel's "Intel 64 and IA-32 Architecture Softwa= re > >>>> Developers Manual" http://www.intel.com/content/dam/doc/manual/ > >>>> 64-ia-32-architectures-software-developer-manual-325462.pdf > >>>> and AMD's "AMD64 Architecture Programmer's Manual Volume 3: > >>>> General-Purpose and System Instructions" > >>>> http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf ) > >>>> > >>>> Screenshots of an i686 testing VM (CORE i5 host) before > >>>> and after applying this patch are available under: > >>>> > >>>> http://matrixstorm.com/software/linux/kvm/20111229/before.jpg > >>>> http://matrixstorm.com/software/linux/kvm/20111229/after.jpg > >>>> > >>>> Signed-off-by: Stephan Baerwolf > >>>> --- > >>>> arch/x86/include/asm/kvm_emulate.h | 15 ++++++ > >>>> arch/x86/kvm/emulate.c | 92 > >>>> ++++++++++++++++++++++++++++++++++- > >>>> 2 files changed, 104 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/arch/x86/include/asm/kvm_emulate.h > >>>> b/arch/x86/include/asm/kvm_emulate.h > >>>> index b172bf4..5b68c23 100644 > >>>> --- a/arch/x86/include/asm/kvm_emulate.h > >>>> +++ b/arch/x86/include/asm/kvm_emulate.h > >>>> @@ -301,6 +301,21 @@ struct x86_emulate_ctxt { > >>>> #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE= _PROT32| \ > >>>> X86EMUL_MODE_PROT64) > >>>> =20 > >>>> +/* CPUID vendors */ > >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 > >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 > >>>> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 > >>>> + > >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41 > >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574 > >>>> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273 > >>>> + > >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 > >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e > >>>> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 > >>>> + > >>>> + > >>>> + > >>>> enum x86_intercept_stage { > >>>> X86_ICTP_NONE =3D 0, /* Allow zero-init to not match anyt= hing */ > >>>> X86_ICPT_PRE_EXCEPT, > >>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > >>>> index f1e3be1..3357411 100644 > >>>> --- a/arch/x86/kvm/emulate.c > >>>> +++ b/arch/x86/kvm/emulate.c > >>>> @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulat= e_ctxt > >>>> *ctxt, > >>>> ss->p =3D 1; > >>>> } > >>>> =20 > >>>> +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt) > >>>> +{ > >>>> + struct x86_emulate_ops *ops =3D ctxt->ops; > >>>> + u64 efer =3D 0; > >>>> + > >>>> + /* syscall is not available in real mode = */ > >>>> + if ((ctxt->mode =3D=3D X86EMUL_MODE_REAL) || > >>>> + (ctxt->mode =3D=3D X86EMUL_MODE_VM86)) > >>>> + return false; > >>>> + > >>>> + ops->get_msr(ctxt, MSR_EFER, &efer); > >>>> + /* check - if guestOS is aware of syscall (0x0f05) = */ > >>>> + if ((efer & EFER_SCE) =3D=3D 0) { > >>>> + return false; > >>>> + } else { > >>>> + /* ok, at this point it becomes vendor-specific = */ > >>>> + /* so first get us an cpuid = */ > >>>> + bool vendor; > >>>> + u32 eax, ebx, ecx, edx; > >>>> + > >>>> + /* getting the cpu-vendor = */ > >>>> + eax =3D 0x00000000; > >>>> + ecx =3D 0x00000000; > >>>> + if (likely(ops->get_cpuid)) > >>>> + vendor =3D ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &ed= x); > >>>> + else vendor =3D false; > >>>> + > >>>> + if (likely(vendor)) { > >>>> + > >>>> + /* AMD AuthenticAMD / AMDisbetter! = */ > >>>> + if (((ebx=3D=3DX86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &= & > >>>> + (ecx=3D=3DX86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &= & > >>>> + (edx=3D=3DX86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) = || > >>>> + ((ebx=3D=3DX86EMUL_CPUID_VENDOR_AMDisbetter_ebx) && > >>>> + (ecx=3D=3DX86EMUL_CPUID_VENDOR_AMDisbetter_ecx) && > >>>> + (edx=3D=3DX86EMUL_CPUID_VENDOR_AMDisbetter_edx))) = { > >>>> + > >>>> + /* if cpu is not in longmode... = */ > >>>> + /* ...check edx bit11 of cpuid 0x80000001 = */ > >>>> + if (ctxt->mode !=3D X86EMUL_MODE_PROT64) { > >>>> + eax =3D 0x80000001; > >>>> + ecx =3D 0x00000000; > >>>> + vendor =3D ops->get_cpuid(ctxt, &eax, &ebx, &ec= x, &edx); > >>>> + if (likely(vendor)) { > >>>> + if (unlikely(((edx >> 11) & 0x1) =3D=3D 0)) > >>>> + return false; > >>>> + } else { > >>>> + return false; /* assuming there is no bit 1= 1 */ > >>>> + } > >>>> + } > >>>> + goto __em_syscall_enabled_noprotest; > >>>> + } /* end "AMD" */ > >>>> + > >>>> + > >>>> + /* Intel (GenuineIntel) = */ > >>>> + /* remarks: Intel CPUs only support "syscall" in 64bit = longmode */ > >>>> + /* Also an 64bit guest within a 32bit compat-a= pp running*/ > >>>> + /* will #UD !! = */ > >>>> + /* While this behaviour can be fixed (by emula= ting) into*/ > >>>> + /* an AMD response - CPUs of AMD can't behave = like Intel*/ > >>>> + /* because without an hardware-raised #UD ther= e is no */ > >>>> + /* call in em.-mode (see x86_emulate_instructi= on(...))! */ > >>>> + /* TODO: make AMD-behaviour configurable = */ > >>>> + if ((ebx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_ebx) && > >>>> + (ecx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_ecx) && > >>>> + (edx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_edx)) { > >>>> + if (ctxt->mode !=3D X86EMUL_MODE_PROT64) > >>>> + return false; > >>>> + goto __em_syscall_enabled_noprotest; > >>>> + } /* end "Intel" */ > >>>> + > >>>> + } /* end vendor is true */ > >>>> + > >>>> + } /* end MSR_EFER check */ > >>>> + > >>>> + /* default: */ > >>>> + /* this code wasn't able to process vendor */ > >>>> + /* so apply Intels stricter rules... */ > >>>> + pr_err_ratelimited("kvm: %i: unknown vcpu vendor - assu= ming > >>>> intel\n", > >>>> + current->tgid); > >>>> + > >>>> + if (ctxt->mode =3D=3D X86EMUL_MODE_PROT64) goto > >>>> __em_syscall_enabled_noprotest; > >>>> + return false; > >>>> + > >>>> +__em_syscall_enabled_noprotest: > >>>> + return true; > >>>> +} > >>>> + > >>>> static int em_syscall(struct x86_emulate_ctxt *ctxt) > >>>> { > >>>> struct x86_emulate_ops *ops =3D ctxt->ops; > >>>> @@ -1885,9 +1973,7 @@ static int em_syscall(struct x86_emulate_c= txt *ctxt) > >>>> u16 cs_sel, ss_sel; > >>>> u64 efer =3D 0; > >>>> =20 > >>>> - /* syscall is not available in real mode */ > >>>> - if (ctxt->mode =3D=3D X86EMUL_MODE_REAL || > >>>> - ctxt->mode =3D=3D X86EMUL_MODE_VM86) > >>>> + if (!(em_syscall_isenabled(ctxt))) > >>>> return emulate_ud(ctxt); > >>>> =20 > >>>> ops->get_msr(ctxt, MSR_EFER, &efer); > >>> Stephan,=20 > >>> > >>> I think only the 2-line check to inject UD if EFER_SCE is not set= is > >>> missing in em_syscall. The guest is responsible for setting a pro= per > >>> MSR_STAR for EIP only if it enables SCE_EFER. > >>> > >>> Can you please test & resend that patch? Thanks. > >>> > >>> > >>> > >> This wouln't work correctly on 64bit longmode guest > >> running an app in 32bit compat. > > The AMD pseudo-code, in volume 3 "General-Purpose and System > > Instructions", says: > > > > SYSCALL_START: > > > > IF (MSR_EFER.SCE =3D 0) // Check if syscall/sysret are enabled. > > EXCEPTION [#UD] > > > > IF (LONG_MODE) > > SYSCALL_LONG_MODE > > ELSE // (LEGACY_MODE) > > SYSCALL_LEGACY_MODE > > > >> (Even on AMD in some special conditions.) > >> The reason is, in 64Bit compat the MSR_EFER_SCE is still > >> enabled but syscall is not always available by cpu in such modes. > >> (I also compared always with a physical computer's behaviour.) > >> See Intel or even AMD-doku > >> (http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf) pa= ge 376. > >> > >> Of course in 64bit it should NOT crash (and indeed I haven't > >> observed it, yet) but if I have cpuid GenuineIntel it should behav= e > >> like Intel cpu and not like some special AMD. > >> (Otherwise I could overwrite the vendor and then the patch will > >> emulate AMD...) > >> > >> ??What could be a good speedup: Reordering the EFER check and the > >> real-mode checks > >> (I would have to check doku, but EFER is 0 in realmode, isn't it?)= , then > >> checking if (not)longmode and then: > >> > >> I think, the patch ->only<- (and only!?) becomes a little bit slow= er > >> exactly > >> in this situation (32bit compat under 64bit long) because in every= other > >> case? > >> the first ifs (testing the MSR_EFER and the mode) should decide to= #UD... > >> (At all it should be faster at the end, then it is now ?) > >> > >> ...I'll prepare a patch for what I mean... > >> > >> I am looking forward to your response, > >> > >> regards Stephan > > How about this: > > > > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > > index 05a562b..4e8e534 100644 > > --- a/arch/x86/kvm/emulate.c > > +++ b/arch/x86/kvm/emulate.c > > @@ -1907,6 +1907,9 @@ static int em_syscall(struct x86_emulate_ctxt= *ctxt) > > ops->get_msr(ctxt, MSR_EFER, &efer); > > setup_syscalls_segments(ctxt, &cs, &ss); > > =20 > > + if (!(efer & EFER_SCE)) > > + return emulate_ud(ctxt); > > + > > ops->get_msr(ctxt, MSR_STAR, &msr_data); > > msr_data >>=3D 32; > > cs_sel =3D (u16)(msr_data & 0xfffc); > > > Hi Marcelo, > thank you for your fast response. >=20 > Yes, this would be more elegant, ->but<- not exact at least on Intel. > (http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-s= oftware-developer-manual-325462.pdf) > 4-586 Vol. 2B or PDF-page 1804 > "[...] (* Not in 64-Bit Mode or SYSCALL/SYSRET not enabled in IA32_EF= ER > *) [...]" >=20 > We have to check somehow cpu-vendor to get info what cpu should be > emulated... > ...I'll prepare a patch by tomorrow reordering things and compacting > checks on AMD's branch. > (obviously cpuid 0x80000001 edx bit 11 needs not to be checked, as fa= r > as the pseudocode tells... ) Right. Please have the EFER check as i suggested (its cleaner), and then only this check for Intel should suffice: + /* TODO: make AMD-behaviour configurable */ + if ((ebx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_ebx) && + (ecx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_ecx) && + (edx=3D=3DX86EMUL_CPUID_VENDOR_GenuineIntel_edx)) { + if (ctxt->mode !=3D X86EMUL_MODE_PROT64) Please mind Documentation/CodingStyle. Thanks.