* [Xen-devel] [PATCH 0/6] x86emul: further work @ 2019-07-01 11:47 Jan Beulich 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich ` (6 more replies) 0 siblings, 7 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:47 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monne On top of the AVX512 series I'd like to put out for review/discussion 1: generalize wbinvd() hook 2: support WBNOINVD 3: generalize invlpg() hook 4: move INVPCID_TYPE_* to x86-defns.h 5: support INVPCID 6: support MOVDIR{I,64B} insns Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich @ 2019-07-01 11:56 ` Jan Beulich 2019-07-02 10:22 ` Paul Durrant ` (2 more replies) 2019-07-01 11:56 ` [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD Jan Beulich ` (5 subsequent siblings) 6 siblings, 3 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:56 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, Roger Pau Monne The hook is already in use for other purposes, and emulating e.g. CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and add parameters. Use lighter weight flushing insns when possible in hvmemul_cache_op(). hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is to retain original behavior, but I'm not sure this is what we want in the long run. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: Use cache_op() as hook name. Convert macros to inline functions in system.h. Re-base. --- I was unsure about PREFETCH* and CLDEMOTE - both are cache management insns too, but the emulator currently treats them as a NOP without invoking any hooks. I was also uncertain about the new cache_flush_permitted() instance - generally I think it wouldn't be too bad if we allowed line flushes in all cases, in which case the checks in the ->wbinvd_intercept() handlers would suffice (as they did until now). 
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -382,10 +382,13 @@ static int fuzz_invlpg( return maybe_fail(ctxt, "invlpg", false); } -static int fuzz_wbinvd( +static int fuzz_cache_op( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { - return maybe_fail(ctxt, "wbinvd", true); + return maybe_fail(ctxt, "cache-management", true); } static int fuzz_write_io( @@ -620,7 +623,7 @@ static const struct x86_emulate_ops all_ SET(read_xcr), SET(read_msr), SET(write_msr), - SET(wbinvd), + SET(cache_op), SET(invlpg), .get_fpu = emul_test_get_fpu, .put_fpu = emul_test_put_fpu, @@ -729,7 +732,7 @@ enum { HOOK_read_xcr, HOOK_read_msr, HOOK_write_msr, - HOOK_wbinvd, + HOOK_cache_op, HOOK_cpuid, HOOK_inject_hw_exception, HOOK_inject_sw_interrupt, @@ -773,7 +776,7 @@ static void disable_hooks(struct x86_emu MAYBE_DISABLE_HOOK(read_xcr); MAYBE_DISABLE_HOOK(read_msr); MAYBE_DISABLE_HOOK(write_msr); - MAYBE_DISABLE_HOOK(wbinvd); + MAYBE_DISABLE_HOOK(cache_op); MAYBE_DISABLE_HOOK(cpuid); MAYBE_DISABLE_HOOK(get_fpu); MAYBE_DISABLE_HOOK(invlpg); --- a/xen/arch/x86/Rules.mk +++ b/xen/arch/x86/Rules.mk @@ -19,7 +19,9 @@ $(call as-option-add,CFLAGS,CC,"crc32 %e $(call as-option-add,CFLAGS,CC,"invept (%rax)$$(comma)%rax",-DHAVE_AS_EPT) $(call as-option-add,CFLAGS,CC,"rdrand %eax",-DHAVE_AS_RDRAND) $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE) +$(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT) $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED) +$(call as-option-add,CFLAGS,CC,"clwb (%rax)",-DHAVE_AS_CLWB) $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \ -U__OBJECT_LABEL__ -DHAVE_AS_QUOTED_SYM \ '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@') --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -25,6 +25,7 @@ #include <asm/hvm/trace.h> #include <asm/hvm/support.h> #include 
<asm/hvm/svm/svm.h> +#include <asm/iocap.h> #include <asm/vm_event.h> static void hvmtrace_io_assist(const ioreq_t *p) @@ -555,16 +556,12 @@ static void *hvmemul_map_linear_addr( mfn_t *mfn = &hvmemul_ctxt->mfn[0]; /* - * The caller has no legitimate reason for trying a zero-byte write, but - * all other code here is written to work if the check below was dropped. - * - * The maximum write size depends on the number of adjacent mfns[] which + * The maximum access size depends on the number of adjacent mfns[] which * can be vmap()'d, accouting for possible misalignment within the region. * The higher level emulation callers are responsible for ensuring that - * mfns[] is large enough for the requested write size. + * mfns[] is large enough for the requested access size. */ - if ( bytes == 0 || - nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) + if ( nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) { ASSERT_UNREACHABLE(); goto unhandleable; @@ -669,8 +666,6 @@ static void hvmemul_unmap_linear_addr( unsigned int i; mfn_t *mfn = &hvmemul_ctxt->mfn[0]; - ASSERT(bytes > 0); - if ( nr_frames == 1 ) unmap_domain_page(mapping); else @@ -1473,7 +1468,10 @@ static int hvmemul_write_msr_discard( return X86EMUL_OKAY; } -static int hvmemul_wbinvd_discard( +static int hvmemul_cache_op_discard( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { return X86EMUL_OKAY; @@ -2149,10 +2147,65 @@ static int hvmemul_write_msr( return rc; } -static int hvmemul_wbinvd( +static int hvmemul_cache_op( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { - alternative_vcall(hvm_funcs.wbinvd_intercept); + struct hvm_emulate_ctxt *hvmemul_ctxt = + container_of(ctxt, struct hvm_emulate_ctxt, ctxt); + unsigned long addr, reps = 1; + uint32_t pfec = PFEC_page_present; + int rc; + void *mapping; + + if ( !cache_flush_permitted(current->domain) ) + return X86EMUL_OKAY; + + switch ( op ) + { + case 
x86emul_clflush: + case x86emul_clflushopt: + case x86emul_clwb: + ASSERT(!is_x86_system_segment(seg)); + + rc = hvmemul_virtual_to_linear(seg, offset, 0, &reps, + hvm_access_read, hvmemul_ctxt, &addr); + if ( rc != X86EMUL_OKAY ) + break; + + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) + pfec |= PFEC_user_mode; + + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt, + current->arch.hvm.data_cache); + if ( mapping == ERR_PTR(~X86EMUL_EXCEPTION) ) + return X86EMUL_EXCEPTION; + if ( IS_ERR_OR_NULL(mapping) ) + break; + + if ( cpu_has_clflush ) + { + if ( op == x86emul_clwb && cpu_has_clwb ) + clwb(mapping); + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) + clflushopt(mapping); + else + clflush(mapping); + + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); + break; + } + + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); + /* fall through */ + case x86emul_invd: + case x86emul_wbinvd: + alternative_vcall(hvm_funcs.wbinvd_intercept); + break; + } + return X86EMUL_OKAY; } @@ -2353,7 +2406,7 @@ static const struct x86_emulate_ops hvm_ .write_xcr = hvmemul_write_xcr, .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr, - .wbinvd = hvmemul_wbinvd, + .cache_op = hvmemul_cache_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, @@ -2380,7 +2433,7 @@ static const struct x86_emulate_ops hvm_ .write_xcr = hvmemul_write_xcr, .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr_discard, - .wbinvd = hvmemul_wbinvd_discard, + .cache_op = hvmemul_cache_op_discard, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1118,9 +1118,11 @@ static int write_msr(unsigned int reg, u return X86EMUL_UNHANDLEABLE; } -/* Name it differently to avoid clashing with wbinvd() */ -static int _wbinvd(struct x86_emulate_ctxt *ctxt) +static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, + 
unsigned long offset, struct x86_emulate_ctxt *ctxt) { + ASSERT(op == x86emul_wbinvd); + /* Ignore the instruction if unprivileged. */ if ( !cache_flush_permitted(current->domain) ) /* @@ -1238,7 +1240,7 @@ static const struct x86_emulate_ops priv .read_msr = read_msr, .write_msr = write_msr, .cpuid = x86emul_cpuid, - .wbinvd = _wbinvd, + .cache_op = cache_op, }; int pv_emulate_privileged_op(struct cpu_user_regs *regs) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -5933,8 +5933,11 @@ x86_emulate( case X86EMUL_OPC(0x0f, 0x08): /* invd */ case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ generate_exception_if(!mode_ring0(), EXC_GP, 0); - fail_if(ops->wbinvd == NULL); - if ( (rc = ops->wbinvd(ctxt)) != 0 ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(b == 0x09 ? x86emul_wbinvd + : x86emul_invd, + x86_seg_none, 0, + ctxt)) != X86EMUL_OKAY ) goto done; break; @@ -7801,8 +7804,9 @@ x86_emulate( /* else clwb */ fail_if(!vex.pfx); vcpu_must_have(clwb); - fail_if(!ops->wbinvd); - if ( (rc = ops->wbinvd(ctxt)) != X86EMUL_OKAY ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(x86emul_clwb, ea.mem.seg, ea.mem.off, + ctxt)) != X86EMUL_OKAY ) goto done; break; case 7: @@ -7818,8 +7822,11 @@ x86_emulate( vcpu_must_have(clflush); else vcpu_must_have(clflushopt); - fail_if(ops->wbinvd == NULL); - if ( (rc = ops->wbinvd(ctxt)) != 0 ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(vex.pfx ? 
x86emul_clflushopt + : x86emul_clflush, + ea.mem.seg, ea.mem.off, + ctxt)) != X86EMUL_OKAY ) goto done; break; default: --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -176,6 +176,14 @@ enum x86_emulate_fpu_type { X86EMUL_FPU_none }; +enum x86emul_cache_op { + x86emul_clflush, + x86emul_clflushopt, + x86emul_clwb, + x86emul_invd, + x86emul_wbinvd, +}; + struct x86_emulate_state; /* @@ -452,8 +460,15 @@ struct x86_emulate_ops uint64_t val, struct x86_emulate_ctxt *ctxt); - /* wbinvd: Write-back and invalidate cache contents. */ - int (*wbinvd)( + /* + * cache_op: Write-back and/or invalidate cache contents. + * + * @seg:@offset applicable only to some of enum x86emul_cache_op. + */ + int (*cache_op)( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt); /* cpuid: Emulate CPUID via given set of EAX-EDX inputs/outputs. */ --- a/xen/include/asm-x86/cpufeature.h +++ b/xen/include/asm-x86/cpufeature.h @@ -102,6 +102,8 @@ #define cpu_has_rdseed boot_cpu_has(X86_FEATURE_RDSEED) #define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP) #define cpu_has_avx512_ifma boot_cpu_has(X86_FEATURE_AVX512_IFMA) +#define cpu_has_clflushopt boot_cpu_has(X86_FEATURE_CLFLUSHOPT) +#define cpu_has_clwb boot_cpu_has(X86_FEATURE_CLWB) #define cpu_has_avx512er boot_cpu_has(X86_FEATURE_AVX512ER) #define cpu_has_avx512cd boot_cpu_has(X86_FEATURE_AVX512CD) #define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA) --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -21,6 +21,23 @@ static inline void clflush(const void *p asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); } +static inline void clflushopt(const void *p) +{ + asm volatile ( "data16 clflush %0" :: "m" (*(const char *)p) ); +} + +static inline void clwb(const void *p) +{ +#if defined(HAVE_AS_CLWB) + asm volatile ( "clwb %0" :: "m" (*(const char *)p) ); +#elif defined(HAVE_AS_XSAVEOPT) + asm volatile ( "data16 xsaveopt %0" :: 
"m" (*(const char *)p) ); +#else + asm volatile ( ".byte 0x66, 0x0f, 0xae, 0x32" + :: "d" (p), "m" (*(const char *)p) ); +#endif +} + #define xchg(ptr,v) \ ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr))))
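An aside on the system.h fallbacks in the hunk above: CLFLUSH (0F AE /7), CLFLUSHOPT (66 0F AE /7), CLWB (66 0F AE /6) and XSAVEOPT (0F AE /6) all share the 0F AE opcode and differ only in the 66 prefix and the ModRM reg field. That is why clflushopt() can be spelled "data16 clflush" and clwb() can fall back to "data16 xsaveopt" or raw bytes on assemblers that know neither mnemonic. A standalone sanity check of the hand-assembled encoding (illustrative code, not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Assemble an x86 ModRM byte from its mod, reg (or /digit opcode
 * extension) and rm fields. */
static uint8_t modrm(unsigned int mod, unsigned int reg, unsigned int rm)
{
    return (uint8_t)(((mod & 3) << 6) | ((reg & 7) << 3) | (rm & 7));
}

/*
 * CLWB is 66 0F AE /6.  With a bare (%rdx) memory operand -- mod=0,
 * rm=2 -- the ModRM byte is 0x32, i.e. the final byte of
 * ".byte 0x66, 0x0f, 0xae, 0x32".  The /6 slot is also XSAVEOPT's,
 * so "data16 xsaveopt (%rdx)" produces identical bytes.
 * CLFLUSH/CLFLUSHOPT use the /7 slot instead (ModRM 0x3a for (%rdx)).
 */
```

The fixed 0x32 ModRM byte pins the memory operand to (%rdx), which is why the raw-bytes variant needs the "d" (p) constraint in addition to the "m" one.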
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich @ 2019-07-02 10:22 ` Paul Durrant 2019-07-02 10:50 ` Jan Beulich 2019-08-27 10:44 ` Andrew Cooper 2019-09-02 11:10 ` [Xen-devel] Ping: " Jan Beulich 2 siblings, 1 reply; 36+ messages in thread From: Paul Durrant @ 2019-07-02 10:22 UTC (permalink / raw) To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne > -----Original Message----- > From: Jan Beulich <JBeulich@suse.com> > Sent: 01 July 2019 12:56 > To: xen-devel@lists.xenproject.org > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne > <roger.pau@citrix.com>; Wei Liu <wl@xen.org> > Subject: [PATCH 1/6] x86emul: generalize wbinvd() hook > > The hook is already in use for other purposes, and emulating e.g. > CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and > add parameters. Use lighter weight flushing insns when possible in > hvmemul_cache_op(). > > hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is > to retain original behavior, but I'm not sure this is what we want in > the long run. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > v2: Use cache_op() as hook name. Convert macros to inline functions in > system.h. Re-base. > --- > I was unsure about PREFETCH* and CLDEMOTE - both are cache management > insns too, but the emulator currently treats them as a NOP without > invoking any hooks. > I was also uncertain about the new cache_flush_permitted() instance - > generally I think it wouldn't be too bad if we allowed line flushes in > all cases, in which case the checks in the ->wbinvd_intercept() handlers > would suffice (as they did until now). 
> [snip] > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -25,6 +25,7 @@ > #include <asm/hvm/trace.h> > #include <asm/hvm/support.h> > #include <asm/hvm/svm/svm.h> > +#include <asm/iocap.h> > #include <asm/vm_event.h> > > static void hvmtrace_io_assist(const ioreq_t *p) > @@ -555,16 +556,12 @@ static void *hvmemul_map_linear_addr( > mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > /* > - * The caller has no legitimate reason for trying a zero-byte write, but > - * all other code here is written to work if the check below was dropped. > - * > - * The maximum write size depends on the number of adjacent mfns[] which > + * The maximum access size depends on the number of adjacent mfns[] which > * can be vmap()'d, accouting for possible misalignment within the region. > * The higher level emulation callers are responsible for ensuring that > - * mfns[] is large enough for the requested write size. > + * mfns[] is large enough for the requested access size. > */ > - if ( bytes == 0 || > - nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) > + if ( nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) > { > ASSERT_UNREACHABLE(); > goto unhandleable; > @@ -669,8 +666,6 @@ static void hvmemul_unmap_linear_addr( > unsigned int i; > mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > - ASSERT(bytes > 0); > - > if ( nr_frames == 1 ) > unmap_domain_page(mapping); > else > @@ -1473,7 +1468,10 @@ static int hvmemul_write_msr_discard( > return X86EMUL_OKAY; > } > > -static int hvmemul_wbinvd_discard( > +static int hvmemul_cache_op_discard( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt) > { > return X86EMUL_OKAY; > @@ -2149,10 +2147,65 @@ static int hvmemul_write_msr( > return rc; > } > > -static int hvmemul_wbinvd( > +static int hvmemul_cache_op( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt) > { > - alternative_vcall(hvm_funcs.wbinvd_intercept); > + struct 
hvm_emulate_ctxt *hvmemul_ctxt = > + container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > + unsigned long addr, reps = 1; > + uint32_t pfec = PFEC_page_present; > + int rc; > + void *mapping; > + > + if ( !cache_flush_permitted(current->domain) ) > + return X86EMUL_OKAY; > + > + switch ( op ) > + { > + case x86emul_clflush: > + case x86emul_clflushopt: > + case x86emul_clwb: > + ASSERT(!is_x86_system_segment(seg)); > + > + rc = hvmemul_virtual_to_linear(seg, offset, 0, &reps, > + hvm_access_read, hvmemul_ctxt, &addr); > + if ( rc != X86EMUL_OKAY ) > + break; > + > + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) > + pfec |= PFEC_user_mode; > + > + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt, > + current->arch.hvm.data_cache); > + if ( mapping == ERR_PTR(~X86EMUL_EXCEPTION) ) > + return X86EMUL_EXCEPTION; > + if ( IS_ERR_OR_NULL(mapping) ) > + break; > + > + if ( cpu_has_clflush ) > + { > + if ( op == x86emul_clwb && cpu_has_clwb ) > + clwb(mapping); > + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) > + clflushopt(mapping); > + else > + clflush(mapping); > + > + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > + break; > + } > + > + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); Since the mapping is ditched here, why bother getting one at all in the !cpu_has_clflush case? Are you trying to flush out an error condition that was previously missed? Paul
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-02 10:22 ` Paul Durrant @ 2019-07-02 10:50 ` Jan Beulich 2019-07-02 10:53 ` Paul Durrant 0 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-07-02 10:50 UTC (permalink / raw) To: Paul Durrant, xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne On 02.07.2019 12:22, Paul Durrant wrote: >> From: Jan Beulich <JBeulich@suse.com> >> Sent: 01 July 2019 12:56 >> >> @@ -2149,10 +2147,65 @@ static int hvmemul_write_msr( >> return rc; >> } >> >> -static int hvmemul_wbinvd( >> +static int hvmemul_cache_op( >> + enum x86emul_cache_op op, >> + enum x86_segment seg, >> + unsigned long offset, >> struct x86_emulate_ctxt *ctxt) >> { >> - alternative_vcall(hvm_funcs.wbinvd_intercept); >> + struct hvm_emulate_ctxt *hvmemul_ctxt = >> + container_of(ctxt, struct hvm_emulate_ctxt, ctxt); >> + unsigned long addr, reps = 1; >> + uint32_t pfec = PFEC_page_present; >> + int rc; >> + void *mapping; >> + >> + if ( !cache_flush_permitted(current->domain) ) >> + return X86EMUL_OKAY; >> + >> + switch ( op ) >> + { >> + case x86emul_clflush: >> + case x86emul_clflushopt: >> + case x86emul_clwb: >> + ASSERT(!is_x86_system_segment(seg)); >> + >> + rc = hvmemul_virtual_to_linear(seg, offset, 0, &reps, >> + hvm_access_read, hvmemul_ctxt, &addr); >> + if ( rc != X86EMUL_OKAY ) >> + break; >> + >> + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) >> + pfec |= PFEC_user_mode; >> + >> + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt, >> + current->arch.hvm.data_cache); >> + if ( mapping == ERR_PTR(~X86EMUL_EXCEPTION) ) >> + return X86EMUL_EXCEPTION; This return ... 
>> + if ( IS_ERR_OR_NULL(mapping) ) >> + break; >> + >> + if ( cpu_has_clflush ) >> + { >> + if ( op == x86emul_clwb && cpu_has_clwb ) >> + clwb(mapping); >> + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) >> + clflushopt(mapping); >> + else >> + clflush(mapping); >> + >> + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); >> + break; >> + } >> + >> + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > > Since the mapping is ditched here, why bother getting one at all in the > > !cpu_has_clflush case? Are you trying to flush out an error condition > that was previously missed? ... is what I'm after: We want exception behavior to be correct. Jan
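Jan's rationale can be sketched in isolation: even when the host lacks CLFLUSH and a WBINVD fallback will be used, the operand is still translated and mapped first, so a faulting address raises the architectural exception before any flushing happens. A simplified, hypothetical mock of that control flow (none of these names are the actual Xen ones):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { EMUL_OKAY, EMUL_EXCEPTION } emul_rc;

static int wbinvd_calls;   /* counts fallback full-cache flushes */
static int clflush_calls;  /* counts single-line flushes */

/* Hypothetical stand-in for hvmemul_map_linear_addr(): pretend linear
 * addresses below 0x1000 are unmapped and would raise #PF. */
static void *map_linear(unsigned long addr)
{
    return addr < 0x1000 ? NULL : (void *)addr;
}

/*
 * Mirrors the shape of hvmemul_cache_op(): translate and map the
 * operand first, so a bad address delivers the guest-visible exception
 * exactly as CLFLUSH would on hardware; only then pick between a line
 * flush and the WBINVD big hammer on hosts lacking CLFLUSH.
 */
static emul_rc emulate_line_flush(unsigned long addr, bool has_clflush)
{
    void *mapping = map_linear(addr);

    if ( mapping == NULL )
        return EMUL_EXCEPTION;  /* #PF delivered; nothing is flushed */

    if ( has_clflush )
        ++clflush_calls;        /* clflush(mapping) */
    else
        ++wbinvd_calls;         /* mapping dropped, WBINVD fallback */

    return EMUL_OKAY;
}
```

In the real code the mapping is also what the line-flush insns operate on; on the !cpu_has_clflush path it serves purely as the access/fault check, which is the point being made above.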
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-02 10:50 ` Jan Beulich @ 2019-07-02 10:53 ` Paul Durrant 0 siblings, 0 replies; 36+ messages in thread From: Paul Durrant @ 2019-07-02 10:53 UTC (permalink / raw) To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne > -----Original Message----- > From: Jan Beulich <JBeulich@suse.com> > Sent: 02 July 2019 11:50 > To: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; Wei Liu > <wl@xen.org> > Subject: Re: [PATCH 1/6] x86emul: generalize wbinvd() hook > > On 02.07.2019 12:22, Paul Durrant wrote: > >> From: Jan Beulich <JBeulich@suse.com> > >> Sent: 01 July 2019 12:56 > >> > >> @@ -2149,10 +2147,65 @@ static int hvmemul_write_msr( > >> return rc; > >> } > >> > >> -static int hvmemul_wbinvd( > >> +static int hvmemul_cache_op( > >> + enum x86emul_cache_op op, > >> + enum x86_segment seg, > >> + unsigned long offset, > >> struct x86_emulate_ctxt *ctxt) > >> { > >> - alternative_vcall(hvm_funcs.wbinvd_intercept); > >> + struct hvm_emulate_ctxt *hvmemul_ctxt = > >> + container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > >> + unsigned long addr, reps = 1; > >> + uint32_t pfec = PFEC_page_present; > >> + int rc; > >> + void *mapping; > >> + > >> + if ( !cache_flush_permitted(current->domain) ) > >> + return X86EMUL_OKAY; > >> + > >> + switch ( op ) > >> + { > >> + case x86emul_clflush: > >> + case x86emul_clflushopt: > >> + case x86emul_clwb: > >> + ASSERT(!is_x86_system_segment(seg)); > >> + > >> + rc = hvmemul_virtual_to_linear(seg, offset, 0, &reps, > >> + hvm_access_read, hvmemul_ctxt, &addr); > >> + if ( rc != X86EMUL_OKAY ) > >> + break; > >> + > >> + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) > >> + pfec |= PFEC_user_mode; > >> + > >> + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt, > >> + current->arch.hvm.data_cache); > >> + if ( mapping == 
ERR_PTR(~X86EMUL_EXCEPTION) ) > >> + return X86EMUL_EXCEPTION; > > This return ... > > >> + if ( IS_ERR_OR_NULL(mapping) ) > >> + break; > >> + > >> + if ( cpu_has_clflush ) > >> + { > >> + if ( op == x86emul_clwb && cpu_has_clwb ) > >> + clwb(mapping); > >> + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) > >> + clflushopt(mapping); > >> + else > >> + clflush(mapping); > >> + > >> + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > >> + break; > >> + } > >> + > >> + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > > > > Since the mapping is ditched here, why bother getting one at all in the > > !cpu_has_clflush case? Are you trying to flush out an error condition > that was previously missed? > > ... is what I'm after: We want exception behavior to be correct. > Ok, fair enough. Just wasn't obvious to me from the commit comment. Paul
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich 2019-07-02 10:22 ` Paul Durrant @ 2019-08-27 10:44 ` Andrew Cooper 2019-08-27 12:51 ` Andrew Cooper 2019-08-27 18:47 ` Andrew Cooper 2019-09-02 11:10 ` [Xen-devel] Ping: " Jan Beulich 2 siblings, 2 replies; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 10:44 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne On 01/07/2019 12:56, Jan Beulich wrote: > The hook is already in use for other purposes, and emulating e.g. > CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and > add parameters. Use lighter weight flushing insns when possible in > hvmemul_cache_op(). > > hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is > to retain original behavior, but I'm not sure this is what we want in > the long run. There is no use for INVD in a VM, as it is never running with the memory controllers yet-to-be configured. The only place it may be found is at the reset vector for a firmware which doesn't start in a virtualisation-aware way. Given how big a hammer WBINVD is, I reckon we should just nop it out. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > v2: Use cache_op() as hook name. Convert macros to inline functions in > system.h. Re-base. > --- > I was unsure about PREFETCH* and CLDEMOTE - both are cache management > insns too, but the emulator currently treats them as a NOP without > invoking any hooks. They are just hints. Nothing architecturally may depend on them having any effect. CLDEMOTE in particular is for reducing cache coherency overhead on a producer/consumer setup, but any potential optimisation is dwarfed by the VMExit. 
> I was also uncertain about the new cache_flush_permitted() instance - > generally I think it wouldn't be too bad if we allowed line flushes in > all cases, in which case the checks in the ->wbinvd_intercept() handlers > would suffice (as they did until now). This is a more general issue which we need to address. To support encrypted memory in VM's, we must guarantee that WC mappings which the guest creates are really WC, which means we must not use IPAT or play any "fall back to WB" games. Furthermore, AMD's encrypt-in-place algorithm requires the guest to be able to use WBINVD. Fixing this properly will be a good thing, and will fix the fact that at the moment, any time you give a PCI device to a guest, its blk/net perf becomes glacial, due to the grant table being uncached. > + > + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) > + pfec |= PFEC_user_mode; As a note, this is fine here, but this whole system of choosing pfec needs to be adjusted when we add support for WRUSS, which is a CPL0 instruction that executes as a user mode write, for adjusting the user shadow stack. ~Andrew
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-08-27 10:44 ` Andrew Cooper @ 2019-08-27 12:51 ` Andrew Cooper 2019-08-27 18:47 ` Andrew Cooper 1 sibling, 0 replies; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 12:51 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne On 27/08/2019 11:44, Andrew Cooper wrote: >> I was also uncertain about the new cache_flush_permitted() instance - >> generally I think it wouldn't be too bad if we allowed line flushes in >> all cases, in which case the checks in the ->wbinvd_intercept() handlers >> would suffice (as they did until now). > This is a more general issue which we need to address. To support > encrypted memory in VM's, we must guarantee that WC mappings which the > guest creates are really WC, which means we must not use IPAT or play > any "fall back to WB" games. > > Furthermore, AMD's encrypt-in-place algorithm requires the guest to be > able to use WBINVD. Apologies. AMD's algorithm requires aliased WP and WB mappings, not WC. ~Andrew
* Re: [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-08-27 10:44 ` Andrew Cooper 2019-08-27 12:51 ` Andrew Cooper @ 2019-08-27 18:47 ` Andrew Cooper 1 sibling, 0 replies; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 18:47 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne On 27/08/2019 11:44, Andrew Cooper wrote: >> I was also uncertain about the new cache_flush_permitted() instance - >> generally I think it wouldn't be too bad if we allowed line flushes in >> all cases, in which case the checks in the ->wbinvd_intercept() handlers >> would suffice (as they did until now). > This is a more general issue which we need to address. To support > encrypted memory in VM's, we must guarantee that WC mappings which the > guest creates are really WC, which means we must not use IPAT or play > any "fall back to WB" games. > > Furthermore, AMD's encrypt-in-place algorithm requires the guest to be > able to use WBINVD. Based on some feedback, the encrypt-in-place algorithm is only for native situations trying to use SME, where the loading kernel/hypervisor needs to encrypt itself. SEV guests run encrypted from the start, and have no need to encrypt/decrypt in place. The expected way is to copy between an encrypted and non-encrypted buffer in separate GFNs. The WBINVD is to ensure that dirty cache lines aren't corrupted by the WP mapping, and CLFLUSH (or any equivalent full eviction) can be substituted, but for the intended usecase, a single WBINVD is the optimum strategy. ~Andrew
* [Xen-devel] Ping: [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich 2019-07-02 10:22 ` Paul Durrant 2019-08-27 10:44 ` Andrew Cooper @ 2019-09-02 11:10 ` Jan Beulich 2019-09-02 12:04 ` Paul Durrant 2 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-09-02 11:10 UTC (permalink / raw) To: Paul Durrant; +Cc: xen-devel, Roger Pau Monne, Wei Liu, Andrew Cooper On 01.07.2019 13:55, Jan Beulich wrote: > The hook is already in use for other purposes, and emulating e.g. > CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and > add parameters. Use lighter weight flushing insns when possible in > hvmemul_cache_op(). > > hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is > to retain original behavior, but I'm not sure this is what we want in > the long run. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Paul, any chance I could get your ack (or otherwise) here? I thought I did answer the one question you had raised to your satisfaction. Thanks, Jan > --- > v2: Use cache_op() as hook name. Convert macros to inline functions in > system.h. Re-base. > --- > I was unsure about PREFETCH* and CLDEMOTE - both are cache management > insns too, but the emulator currently treats them as a NOP without > invoking any hooks. > I was also uncertain about the new cache_flush_permitted() instance - > generally I think it wouldn't be too bad if we allowed line flushes in > all cases, in which case the checks in the ->wbinvd_intercept() handlers > would suffice (as they did until now). 
> > --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c > +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c > @@ -382,10 +382,13 @@ static int fuzz_invlpg( > return maybe_fail(ctxt, "invlpg", false); > } > > -static int fuzz_wbinvd( > +static int fuzz_cache_op( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt) > { > - return maybe_fail(ctxt, "wbinvd", true); > + return maybe_fail(ctxt, "cache-management", true); > } > > static int fuzz_write_io( > @@ -620,7 +623,7 @@ static const struct x86_emulate_ops all_ > SET(read_xcr), > SET(read_msr), > SET(write_msr), > - SET(wbinvd), > + SET(cache_op), > SET(invlpg), > .get_fpu = emul_test_get_fpu, > .put_fpu = emul_test_put_fpu, > @@ -729,7 +732,7 @@ enum { > HOOK_read_xcr, > HOOK_read_msr, > HOOK_write_msr, > - HOOK_wbinvd, > + HOOK_cache_op, > HOOK_cpuid, > HOOK_inject_hw_exception, > HOOK_inject_sw_interrupt, > @@ -773,7 +776,7 @@ static void disable_hooks(struct x86_emu > MAYBE_DISABLE_HOOK(read_xcr); > MAYBE_DISABLE_HOOK(read_msr); > MAYBE_DISABLE_HOOK(write_msr); > - MAYBE_DISABLE_HOOK(wbinvd); > + MAYBE_DISABLE_HOOK(cache_op); > MAYBE_DISABLE_HOOK(cpuid); > MAYBE_DISABLE_HOOK(get_fpu); > MAYBE_DISABLE_HOOK(invlpg); > --- a/xen/arch/x86/Rules.mk > +++ b/xen/arch/x86/Rules.mk > @@ -19,7 +19,9 @@ $(call as-option-add,CFLAGS,CC,"crc32 %e > $(call as-option-add,CFLAGS,CC,"invept (%rax)$$(comma)%rax",-DHAVE_AS_EPT) > $(call as-option-add,CFLAGS,CC,"rdrand %eax",-DHAVE_AS_RDRAND) > $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE) > +$(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT) > $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED) > +$(call as-option-add,CFLAGS,CC,"clwb (%rax)",-DHAVE_AS_CLWB) > $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \ > -U__OBJECT_LABEL__ -DHAVE_AS_QUOTED_SYM \ > '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@') > --- a/xen/arch/x86/hvm/emulate.c > +++ 
b/xen/arch/x86/hvm/emulate.c > @@ -25,6 +25,7 @@ > #include <asm/hvm/trace.h> > #include <asm/hvm/support.h> > #include <asm/hvm/svm/svm.h> > +#include <asm/iocap.h> > #include <asm/vm_event.h> > > static void hvmtrace_io_assist(const ioreq_t *p) > @@ -555,16 +556,12 @@ static void *hvmemul_map_linear_addr( > mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > /* > - * The caller has no legitimate reason for trying a zero-byte write, but > - * all other code here is written to work if the check below was dropped. > - * > - * The maximum write size depends on the number of adjacent mfns[] which > + * The maximum access size depends on the number of adjacent mfns[] which > * can be vmap()'d, accouting for possible misalignment within the region. > * The higher level emulation callers are responsible for ensuring that > - * mfns[] is large enough for the requested write size. > + * mfns[] is large enough for the requested access size. > */ > - if ( bytes == 0 || > - nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) > + if ( nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) > { > ASSERT_UNREACHABLE(); > goto unhandleable; > @@ -669,8 +666,6 @@ static void hvmemul_unmap_linear_addr( > unsigned int i; > mfn_t *mfn = &hvmemul_ctxt->mfn[0]; > > - ASSERT(bytes > 0); > - > if ( nr_frames == 1 ) > unmap_domain_page(mapping); > else > @@ -1473,7 +1468,10 @@ static int hvmemul_write_msr_discard( > return X86EMUL_OKAY; > } > > -static int hvmemul_wbinvd_discard( > +static int hvmemul_cache_op_discard( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt) > { > return X86EMUL_OKAY; > @@ -2149,10 +2147,65 @@ static int hvmemul_write_msr( > return rc; > } > > -static int hvmemul_wbinvd( > +static int hvmemul_cache_op( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt) > { > - alternative_vcall(hvm_funcs.wbinvd_intercept); > + struct hvm_emulate_ctxt *hvmemul_ctxt = > + 
container_of(ctxt, struct hvm_emulate_ctxt, ctxt); > + unsigned long addr, reps = 1; > + uint32_t pfec = PFEC_page_present; > + int rc; > + void *mapping; > + > + if ( !cache_flush_permitted(current->domain) ) > + return X86EMUL_OKAY; > + > + switch ( op ) > + { > + case x86emul_clflush: > + case x86emul_clflushopt: > + case x86emul_clwb: > + ASSERT(!is_x86_system_segment(seg)); > + > + rc = hvmemul_virtual_to_linear(seg, offset, 0, &reps, > + hvm_access_read, hvmemul_ctxt, &addr); > + if ( rc != X86EMUL_OKAY ) > + break; > + > + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) > + pfec |= PFEC_user_mode; > + > + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt, > + current->arch.hvm.data_cache); > + if ( mapping == ERR_PTR(~X86EMUL_EXCEPTION) ) > + return X86EMUL_EXCEPTION; > + if ( IS_ERR_OR_NULL(mapping) ) > + break; > + > + if ( cpu_has_clflush ) > + { > + if ( op == x86emul_clwb && cpu_has_clwb ) > + clwb(mapping); > + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) > + clflushopt(mapping); > + else > + clflush(mapping); > + > + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > + break; > + } > + > + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); > + /* fall through */ > + case x86emul_invd: > + case x86emul_wbinvd: > + alternative_vcall(hvm_funcs.wbinvd_intercept); > + break; > + } > + > return X86EMUL_OKAY; > } > > @@ -2353,7 +2406,7 @@ static const struct x86_emulate_ops hvm_ > .write_xcr = hvmemul_write_xcr, > .read_msr = hvmemul_read_msr, > .write_msr = hvmemul_write_msr, > - .wbinvd = hvmemul_wbinvd, > + .cache_op = hvmemul_cache_op, > .cpuid = x86emul_cpuid, > .get_fpu = hvmemul_get_fpu, > .put_fpu = hvmemul_put_fpu, > @@ -2380,7 +2433,7 @@ static const struct x86_emulate_ops hvm_ > .write_xcr = hvmemul_write_xcr, > .read_msr = hvmemul_read_msr, > .write_msr = hvmemul_write_msr_discard, > - .wbinvd = hvmemul_wbinvd_discard, > + .cache_op = hvmemul_cache_op_discard, > .cpuid = x86emul_cpuid, > .get_fpu = 
hvmemul_get_fpu, > .put_fpu = hvmemul_put_fpu, > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -1118,9 +1118,11 @@ static int write_msr(unsigned int reg, u > return X86EMUL_UNHANDLEABLE; > } > > -/* Name it differently to avoid clashing with wbinvd() */ > -static int _wbinvd(struct x86_emulate_ctxt *ctxt) > +static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, > + unsigned long offset, struct x86_emulate_ctxt *ctxt) > { > + ASSERT(op == x86emul_wbinvd); > + > /* Ignore the instruction if unprivileged. */ > if ( !cache_flush_permitted(current->domain) ) > /* > @@ -1238,7 +1240,7 @@ static const struct x86_emulate_ops priv > .read_msr = read_msr, > .write_msr = write_msr, > .cpuid = x86emul_cpuid, > - .wbinvd = _wbinvd, > + .cache_op = cache_op, > }; > > int pv_emulate_privileged_op(struct cpu_user_regs *regs) > --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -5933,8 +5933,11 @@ x86_emulate( > case X86EMUL_OPC(0x0f, 0x08): /* invd */ > case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ > generate_exception_if(!mode_ring0(), EXC_GP, 0); > - fail_if(ops->wbinvd == NULL); > - if ( (rc = ops->wbinvd(ctxt)) != 0 ) > + fail_if(!ops->cache_op); > + if ( (rc = ops->cache_op(b == 0x09 ? x86emul_wbinvd > + : x86emul_invd, > + x86_seg_none, 0, > + ctxt)) != X86EMUL_OKAY ) > goto done; > break; > > @@ -7801,8 +7804,9 @@ x86_emulate( > /* else clwb */ > fail_if(!vex.pfx); > vcpu_must_have(clwb); > - fail_if(!ops->wbinvd); > - if ( (rc = ops->wbinvd(ctxt)) != X86EMUL_OKAY ) > + fail_if(!ops->cache_op); > + if ( (rc = ops->cache_op(x86emul_clwb, ea.mem.seg, ea.mem.off, > + ctxt)) != X86EMUL_OKAY ) > goto done; > break; > case 7: > @@ -7818,8 +7822,11 @@ x86_emulate( > vcpu_must_have(clflush); > else > vcpu_must_have(clflushopt); > - fail_if(ops->wbinvd == NULL); > - if ( (rc = ops->wbinvd(ctxt)) != 0 ) > + fail_if(!ops->cache_op); > + if ( (rc = ops->cache_op(vex.pfx ? 
x86emul_clflushopt > + : x86emul_clflush, > + ea.mem.seg, ea.mem.off, > + ctxt)) != X86EMUL_OKAY ) > goto done; > break; > default: > --- a/xen/arch/x86/x86_emulate/x86_emulate.h > +++ b/xen/arch/x86/x86_emulate/x86_emulate.h > @@ -176,6 +176,14 @@ enum x86_emulate_fpu_type { > X86EMUL_FPU_none > }; > > +enum x86emul_cache_op { > + x86emul_clflush, > + x86emul_clflushopt, > + x86emul_clwb, > + x86emul_invd, > + x86emul_wbinvd, > +}; > + > struct x86_emulate_state; > > /* > @@ -452,8 +460,15 @@ struct x86_emulate_ops > uint64_t val, > struct x86_emulate_ctxt *ctxt); > > - /* wbinvd: Write-back and invalidate cache contents. */ > - int (*wbinvd)( > + /* > + * cache_op: Write-back and/or invalidate cache contents. > + * > + * @seg:@offset applicable only to some of enum x86emul_cache_op. > + */ > + int (*cache_op)( > + enum x86emul_cache_op op, > + enum x86_segment seg, > + unsigned long offset, > struct x86_emulate_ctxt *ctxt); > > /* cpuid: Emulate CPUID via given set of EAX-EDX inputs/outputs. 
*/ > --- a/xen/include/asm-x86/cpufeature.h > +++ b/xen/include/asm-x86/cpufeature.h > @@ -102,6 +102,8 @@ > #define cpu_has_rdseed boot_cpu_has(X86_FEATURE_RDSEED) > #define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP) > #define cpu_has_avx512_ifma boot_cpu_has(X86_FEATURE_AVX512_IFMA) > +#define cpu_has_clflushopt boot_cpu_has(X86_FEATURE_CLFLUSHOPT) > +#define cpu_has_clwb boot_cpu_has(X86_FEATURE_CLWB) > #define cpu_has_avx512er boot_cpu_has(X86_FEATURE_AVX512ER) > #define cpu_has_avx512cd boot_cpu_has(X86_FEATURE_AVX512CD) > #define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA) > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -21,6 +21,23 @@ static inline void clflush(const void *p > asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); > } > > +static inline void clflushopt(const void *p) > +{ > + asm volatile ( "data16 clflush %0" :: "m" (*(const char *)p) ); > +} > + > +static inline void clwb(const void *p) > +{ > +#if defined(HAVE_AS_CLWB) > + asm volatile ( "clwb %0" :: "m" (*(const char *)p) ); > +#elif defined(HAVE_AS_XSAVEOPT) > + asm volatile ( "data16 xsaveopt %0" :: "m" (*(const char *)p) ); > +#else > + asm volatile ( ".byte 0x66, 0x0f, 0xae, 0x32" > + :: "d" (p), "m" (*(const char *)p) ); > +#endif > +} > + > #define xchg(ptr,v) \ > ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr)))) > > _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
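As a reading aid for the hvmemul_cache_op() logic in the patch above: the handler picks the lightest-weight flush that both the request and the host CPU allow, degrading from CLWB through CLFLUSHOPT to CLFLUSH, and finally falling through to the full ->wbinvd_intercept() path when no line flush is available. A hedged, pure-logic sketch of that fallback chain (the helper name pick_flush and the flag parameters are hypothetical, introduced only for illustration; this is not the Xen code itself):

```c
#include <assert.h>

/* Mirrors enum x86emul_cache_op from the patch. */
enum x86emul_cache_op {
    x86emul_clflush,
    x86emul_clflushopt,
    x86emul_clwb,
    x86emul_invd,
    x86emul_wbinvd,
};

/* Which instruction the sketch ends up selecting. */
enum flush_insn { FLUSH_CLFLUSH, FLUSH_CLFLUSHOPT, FLUSH_CLWB, FLUSH_WBINVD };

/*
 * Hypothetical model of the fallback chain in hvmemul_cache_op():
 * use the requested line-granular insn when the CPU supports it,
 * otherwise degrade to CLFLUSH, and when even CLFLUSH is missing
 * fall through to the (wb)invd intercept.
 */
static enum flush_insn pick_flush(enum x86emul_cache_op op,
                                  int has_clflush, int has_clflushopt,
                                  int has_clwb)
{
    switch ( op )
    {
    case x86emul_clflush:
    case x86emul_clflushopt:
    case x86emul_clwb:
        if ( !has_clflush )
            break;                  /* no line flush available at all */
        if ( op == x86emul_clwb && has_clwb )
            return FLUSH_CLWB;
        if ( op == x86emul_clflushopt && has_clflushopt )
            return FLUSH_CLFLUSHOPT;
        return FLUSH_CLFLUSH;
    default:                        /* invd / wbinvd */
        break;
    }
    return FLUSH_WBINVD;
}
```

Note how a CLWB request on a CPU with CLFLUSHOPT but no CLWB still degrades straight to CLFLUSH, matching the if/else-if chain in the patch rather than trying CLFLUSHOPT first.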
* Re: [Xen-devel] Ping: [PATCH 1/6] x86emul: generalize wbinvd() hook 2019-09-02 11:10 ` [Xen-devel] Ping: " Jan Beulich @ 2019-09-02 12:04 ` Paul Durrant 0 siblings, 0 replies; 36+ messages in thread From: Paul Durrant @ 2019-09-02 12:04 UTC (permalink / raw) To: 'Jan Beulich'; +Cc: xen-devel, Roger Pau Monne, Wei Liu, Andrew Cooper > -----Original Message----- > From: Jan Beulich <jbeulich@suse.com> > Sent: 02 September 2019 12:10 > To: Paul Durrant <Paul.Durrant@citrix.com> > Cc: xen-devel@lists.xenproject.org; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne > <roger.pau@citrix.com>; Wei Liu <wl@xen.org> > Subject: Ping: [PATCH 1/6] x86emul: generalize wbinvd() hook > > On 01.07.2019 13:55, Jan Beulich wrote: > > The hook is already in use for other purposes, and emulating e.g. > > CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and > > add parameters. Use lighter weight flushing insns when possible in > > hvmemul_cache_op(). > > > > hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is > > to retain original behavior, but I'm not sure this is what we want in > > the long run. > > > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > Paul, > > any chance I could get your ack (or otherwise) here? I thought I did > answer the one question you had raised to your satisfaction. > Yes, you did. Sorry for not acking earlier. Acked-by: Paul Durrant <paul.durrant@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich @ 2019-07-01 11:56 ` Jan Beulich 2019-07-02 10:38 ` Paul Durrant 2019-08-27 14:47 ` Andrew Cooper 2019-07-01 11:56 ` [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook Jan Beulich ` (4 subsequent siblings) 6 siblings, 2 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:56 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, RogerPau Monne Rev 035 of Intel's ISA extensions document does not state intercept behavior for the insn (I've been in-officially told that the distinction is going to be by exit qualification, as I would have assumed considering that this way it's sufficiently transparent to unaware software, and using WBINVD in place of WBNOINVD is always correct, just less efficient), so in the HVM case for now it'll be backed by the same ->wbinvd_intercept() handlers. Use this occasion and also add the two missing table entries for CLDEMOTE, which doesn't require any further changes to make work. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: Re-base. Convert wbnoinvd() inline function. 
--- a/tools/libxl/libxl_cpuid.c +++ b/tools/libxl/libxl_cpuid.c @@ -208,6 +208,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"avx512-bitalg",0x00000007, 0, CPUID_REG_ECX, 12, 1}, {"avx512-vpopcntdq",0x00000007,0,CPUID_REG_ECX, 14, 1}, {"rdpid", 0x00000007, 0, CPUID_REG_ECX, 22, 1}, + {"cldemote", 0x00000007, 0, CPUID_REG_ECX, 25, 1}, {"avx512-4vnniw",0x00000007, 0, CPUID_REG_EDX, 2, 1}, {"avx512-4fmaps",0x00000007, 0, CPUID_REG_EDX, 3, 1}, @@ -256,6 +257,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"invtsc", 0x80000007, NA, CPUID_REG_EDX, 8, 1}, + {"wbnoinvd", 0x80000008, NA, CPUID_REG_EBX, 9, 1}, {"ibpb", 0x80000008, NA, CPUID_REG_EBX, 12, 1}, {"nc", 0x80000008, NA, CPUID_REG_ECX, 0, 8}, {"apicidsize", 0x80000008, NA, CPUID_REG_ECX, 12, 4}, --- a/tools/misc/xen-cpuid.c +++ b/tools/misc/xen-cpuid.c @@ -146,6 +146,8 @@ static const char *const str_e8b[32] = { [ 0] = "clzero", + /* [ 8] */ [ 9] = "wbnoinvd", + [12] = "ibpb", }; --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2202,6 +2202,7 @@ static int hvmemul_cache_op( /* fall through */ case x86emul_invd: case x86emul_wbinvd: + case x86emul_wbnoinvd: alternative_vcall(hvm_funcs.wbinvd_intercept); break; } --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1121,7 +1121,7 @@ static int write_msr(unsigned int reg, u static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, unsigned long offset, struct x86_emulate_ctxt *ctxt) { - ASSERT(op == x86emul_wbinvd); + ASSERT(op == x86emul_wbinvd || op == x86emul_wbnoinvd); /* Ignore the instruction if unprivileged. */ if ( !cache_flush_permitted(current->domain) ) @@ -1130,6 +1130,8 @@ static int cache_op(enum x86emul_cache_o * newer linux uses this in some start-of-day timing loops. 
*/ ; + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) + wbnoinvd(); else wbinvd(); --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1869,6 +1869,7 @@ in_protmode( #define vcpu_has_fma4() (ctxt->cpuid->extd.fma4) #define vcpu_has_tbm() (ctxt->cpuid->extd.tbm) #define vcpu_has_clzero() (ctxt->cpuid->extd.clzero) +#define vcpu_has_wbnoinvd() (ctxt->cpuid->extd.wbnoinvd) #define vcpu_has_bmi1() (ctxt->cpuid->feat.bmi1) #define vcpu_has_hle() (ctxt->cpuid->feat.hle) @@ -5931,10 +5932,13 @@ x86_emulate( break; case X86EMUL_OPC(0x0f, 0x08): /* invd */ - case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ + case X86EMUL_OPC(0x0f, 0x09): /* wbinvd / wbnoinvd */ generate_exception_if(!mode_ring0(), EXC_GP, 0); fail_if(!ops->cache_op); - if ( (rc = ops->cache_op(b == 0x09 ? x86emul_wbinvd + if ( (rc = ops->cache_op(b == 0x09 ? !repe_prefix() || + !vcpu_has_wbnoinvd() + ? x86emul_wbinvd + : x86emul_wbnoinvd : x86emul_invd, x86_seg_none, 0, ctxt)) != X86EMUL_OKAY ) --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -182,6 +182,7 @@ enum x86emul_cache_op { x86emul_clwb, x86emul_invd, x86emul_wbinvd, + x86emul_wbnoinvd, }; struct x86_emulate_state; --- a/xen/include/asm-x86/cpufeature.h +++ b/xen/include/asm-x86/cpufeature.h @@ -129,6 +129,9 @@ #define cpu_has_avx512_4fmaps boot_cpu_has(X86_FEATURE_AVX512_4FMAPS) #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) +/* CPUID level 0x80000008.ebx */ +#define cpu_has_wbnoinvd boot_cpu_has(X86_FEATURE_WBNOINVD) + /* Synthesized. 
*/ #define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON) #define cpu_has_cpuid_faulting boot_cpu_has(X86_FEATURE_CPUID_FAULTING) --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -16,6 +16,11 @@ static inline void wbinvd(void) asm volatile ( "wbinvd" ::: "memory" ); } +static inline void wbnoinvd(void) +{ + asm volatile ( "repe; wbinvd" : : : "memory" ); +} + static inline void clflush(const void *p) { asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -236,6 +236,7 @@ XEN_CPUFEATURE(AVX512_VNNI, 6*32+11) / XEN_CPUFEATURE(AVX512_BITALG, 6*32+12) /*A Support for VPOPCNT[B,W] and VPSHUFBITQMB */ XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */ XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */ +XEN_CPUFEATURE(CLDEMOTE, 6*32+25) /*A CLDEMOTE instruction */ /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */ XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */ @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ XEN_CPUFEATURE(IBPB, 8*32+12) /*A IBPB support only (no IBRS, used by AMD) */ /* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
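The decode change to x86_emulate() in this patch boils down to a small truth table: opcode 0F 08 is always INVD, while 0F 09 is treated as WBNOINVD only when both an REPE (F3) prefix is present and the guest's CPUID advertises the feature; otherwise the prefix is ignored and plain WBINVD is emulated, which is architecturally safe, just less efficient. A hedged sketch of that dispatch (the function name is hypothetical, for illustration only):

```c
#include <assert.h>

enum x86emul_cache_op { x86emul_invd, x86emul_wbinvd, x86emul_wbnoinvd };

/*
 * Hypothetical model of the opcode dispatch for 0F 08 / 0F 09 as
 * modified by the patch. "b" is the second opcode byte.
 */
static enum x86emul_cache_op decode_0f_08_09(unsigned char b,
                                             int repe_prefix,
                                             int vcpu_has_wbnoinvd)
{
    if ( b == 0x08 )
        return x86emul_invd;
    /* 0x09: WBNOINVD only with an F3 prefix and the CPUID bit set. */
    return repe_prefix && vcpu_has_wbnoinvd ? x86emul_wbnoinvd
                                            : x86emul_wbinvd;
}
```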
* Re: [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD 2019-07-01 11:56 ` [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD Jan Beulich @ 2019-07-02 10:38 ` Paul Durrant 2019-08-27 14:47 ` Andrew Cooper 1 sibling, 0 replies; 36+ messages in thread From: Paul Durrant @ 2019-07-02 10:38 UTC (permalink / raw) To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne > -----Original Message----- > From: Jan Beulich <JBeulich@suse.com> > Sent: 01 July 2019 12:57 > To: xen-devel@lists.xenproject.org > Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne > <roger.pau@citrix.com>; Wei Liu <wl@xen.org> > Subject: [PATCH 2/6] x86emul: support WBNOINVD > > Rev 035 of Intel's ISA extensions document does not state intercept > behavior for the insn (I've been in-officially told that the distinction > is going to be by exit qualification, as I would have assumed > considering that this way it's sufficiently transparent to unaware > software, and using WBINVD in place of WBNOINVD is always correct, just > less efficient), so in the HVM case for now it'll be backed by the same > ->wbinvd_intercept() handlers. > > Use this occasion and also add the two missing table entries for > CLDEMOTE, which doesn't require any further changes to make work. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> > --- > v2: Re-base. Convert wbnoinvd() inline function. 
> > --- a/tools/libxl/libxl_cpuid.c > +++ b/tools/libxl/libxl_cpuid.c > @@ -208,6 +208,7 @@ int libxl_cpuid_parse_config(libxl_cpuid > {"avx512-bitalg",0x00000007, 0, CPUID_REG_ECX, 12, 1}, > {"avx512-vpopcntdq",0x00000007,0,CPUID_REG_ECX, 14, 1}, > {"rdpid", 0x00000007, 0, CPUID_REG_ECX, 22, 1}, > + {"cldemote", 0x00000007, 0, CPUID_REG_ECX, 25, 1}, > > {"avx512-4vnniw",0x00000007, 0, CPUID_REG_EDX, 2, 1}, > {"avx512-4fmaps",0x00000007, 0, CPUID_REG_EDX, 3, 1}, > @@ -256,6 +257,7 @@ int libxl_cpuid_parse_config(libxl_cpuid > > {"invtsc", 0x80000007, NA, CPUID_REG_EDX, 8, 1}, > > + {"wbnoinvd", 0x80000008, NA, CPUID_REG_EBX, 9, 1}, > {"ibpb", 0x80000008, NA, CPUID_REG_EBX, 12, 1}, > {"nc", 0x80000008, NA, CPUID_REG_ECX, 0, 8}, > {"apicidsize", 0x80000008, NA, CPUID_REG_ECX, 12, 4}, > --- a/tools/misc/xen-cpuid.c > +++ b/tools/misc/xen-cpuid.c > @@ -146,6 +146,8 @@ static const char *const str_e8b[32] = > { > [ 0] = "clzero", > > + /* [ 8] */ [ 9] = "wbnoinvd", > + > [12] = "ibpb", > }; > > --- a/xen/arch/x86/hvm/emulate.c > +++ b/xen/arch/x86/hvm/emulate.c > @@ -2202,6 +2202,7 @@ static int hvmemul_cache_op( > /* fall through */ > case x86emul_invd: > case x86emul_wbinvd: > + case x86emul_wbnoinvd: > alternative_vcall(hvm_funcs.wbinvd_intercept); > break; > } > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -1121,7 +1121,7 @@ static int write_msr(unsigned int reg, u > static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, > unsigned long offset, struct x86_emulate_ctxt *ctxt) > { > - ASSERT(op == x86emul_wbinvd); > + ASSERT(op == x86emul_wbinvd || op == x86emul_wbnoinvd); > > /* Ignore the instruction if unprivileged. */ > if ( !cache_flush_permitted(current->domain) ) > @@ -1130,6 +1130,8 @@ static int cache_op(enum x86emul_cache_o > * newer linux uses this in some start-of-day timing loops. 
> */ > ; > + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) > + wbnoinvd(); > else > wbinvd(); > > --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -1869,6 +1869,7 @@ in_protmode( > #define vcpu_has_fma4() (ctxt->cpuid->extd.fma4) > #define vcpu_has_tbm() (ctxt->cpuid->extd.tbm) > #define vcpu_has_clzero() (ctxt->cpuid->extd.clzero) > +#define vcpu_has_wbnoinvd() (ctxt->cpuid->extd.wbnoinvd) > > #define vcpu_has_bmi1() (ctxt->cpuid->feat.bmi1) > #define vcpu_has_hle() (ctxt->cpuid->feat.hle) > @@ -5931,10 +5932,13 @@ x86_emulate( > break; > > case X86EMUL_OPC(0x0f, 0x08): /* invd */ > - case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ > + case X86EMUL_OPC(0x0f, 0x09): /* wbinvd / wbnoinvd */ > generate_exception_if(!mode_ring0(), EXC_GP, 0); > fail_if(!ops->cache_op); > - if ( (rc = ops->cache_op(b == 0x09 ? x86emul_wbinvd > + if ( (rc = ops->cache_op(b == 0x09 ? !repe_prefix() || > + !vcpu_has_wbnoinvd() > + ? x86emul_wbinvd > + : x86emul_wbnoinvd > : x86emul_invd, > x86_seg_none, 0, > ctxt)) != X86EMUL_OKAY ) > --- a/xen/arch/x86/x86_emulate/x86_emulate.h > +++ b/xen/arch/x86/x86_emulate/x86_emulate.h > @@ -182,6 +182,7 @@ enum x86emul_cache_op { > x86emul_clwb, > x86emul_invd, > x86emul_wbinvd, > + x86emul_wbnoinvd, > }; > > struct x86_emulate_state; > --- a/xen/include/asm-x86/cpufeature.h > +++ b/xen/include/asm-x86/cpufeature.h > @@ -129,6 +129,9 @@ > #define cpu_has_avx512_4fmaps boot_cpu_has(X86_FEATURE_AVX512_4FMAPS) > #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) > > +/* CPUID level 0x80000008.ebx */ > +#define cpu_has_wbnoinvd boot_cpu_has(X86_FEATURE_WBNOINVD) > + > /* Synthesized. 
*/ > #define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON) > #define cpu_has_cpuid_faulting boot_cpu_has(X86_FEATURE_CPUID_FAULTING) > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -16,6 +16,11 @@ static inline void wbinvd(void) > asm volatile ( "wbinvd" ::: "memory" ); > } > > +static inline void wbnoinvd(void) > +{ > + asm volatile ( "repe; wbinvd" : : : "memory" ); > +} > + > static inline void clflush(const void *p) > { > asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); > --- a/xen/include/public/arch-x86/cpufeatureset.h > +++ b/xen/include/public/arch-x86/cpufeatureset.h > @@ -236,6 +236,7 @@ XEN_CPUFEATURE(AVX512_VNNI, 6*32+11) / > XEN_CPUFEATURE(AVX512_BITALG, 6*32+12) /*A Support for VPOPCNT[B,W] and VPSHUFBITQMB */ > XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */ > XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */ > +XEN_CPUFEATURE(CLDEMOTE, 6*32+25) /*A CLDEMOTE instruction */ > > /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */ > XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */ > @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / > > /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ > XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ > +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ > XEN_CPUFEATURE(IBPB, 8*32+12) /*A IBPB support only (no IBRS, used by AMD) */ > > /* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD 2019-07-01 11:56 ` [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD Jan Beulich 2019-07-02 10:38 ` Paul Durrant @ 2019-08-27 14:47 ` Andrew Cooper 2019-08-27 15:08 ` Jan Beulich 1 sibling, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 14:47 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne On 01/07/2019 12:56, Jan Beulich wrote: > Rev 035 of Intel's ISA extensions document does not state intercept > behavior for the insn (I've been in-officially told that the distinction unofficially. > is going to be by exit qualification, as I would have assumed > considering that this way it's sufficiently transparent to unaware > software, and using WBINVD in place of WBNOINVD is always correct, just > less efficient), so in the HVM case for now it'll be backed by the same > ->wbinvd_intercept() handlers. It turns out that AMD reuses the WBINVD vmexit code for WBNOINVD, and provides no distinguishing information. There may or may not be an instruction stream to inspect, depending on other errata. I have a question out with AMD as to what to do here, but in the meantime we have no option but to assume WBINVD. > --- a/xen/arch/x86/pv/emul-priv-op.c > +++ b/xen/arch/x86/pv/emul-priv-op.c > @@ -1121,7 +1121,7 @@ static int write_msr(unsigned int reg, u > @@ -1130,6 +1130,8 @@ static int cache_op(enum x86emul_cache_o > * newer linux uses this in some start-of-day timing loops. > */ > ; > + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) > + wbnoinvd(); > else > wbinvd(); The cpu_has_wbnoinvd check isn't necessary. The encoding was chosen because it does get interpreted as wbinvd on older processors. Given this property, I expect kernels to perform a blanket s/wbinvd/wbnoinvd/ transformation in appropriate locations, because it is the lowest overhead option to start using this new feature. 
> --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -16,6 +16,11 @@ static inline void wbinvd(void) > asm volatile ( "wbinvd" ::: "memory" ); > } > > +static inline void wbnoinvd(void) > +{ > + asm volatile ( "repe; wbinvd" : : : "memory" ); Semicolon. > +} > + > static inline void clflush(const void *p) > { > asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); > --- a/xen/include/public/arch-x86/cpufeatureset.h > +++ b/xen/include/public/arch-x86/cpufeatureset.h > @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / > > /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ > XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ > +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ This is implicitly linked with CPUID.8000001d which we don't expose yet. To get the emulation side of things sorted, I'd be happy with this going in without the A for now, and "exposing WBNOINVD to guests" can be a followup task. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD 2019-08-27 14:47 ` Andrew Cooper @ 2019-08-27 15:08 ` Jan Beulich 2019-08-27 16:45 ` Andrew Cooper 0 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-08-27 15:08 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 27.08.2019 16:47, Andrew Cooper wrote: > On 01/07/2019 12:56, Jan Beulich wrote: >> --- a/xen/arch/x86/pv/emul-priv-op.c >> +++ b/xen/arch/x86/pv/emul-priv-op.c >> @@ -1121,7 +1121,7 @@ static int write_msr(unsigned int reg, u >> @@ -1130,6 +1130,8 @@ static int cache_op(enum x86emul_cache_o >> * newer linux uses this in some start-of-day timing loops. >> */ >> ; >> + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) >> + wbnoinvd(); >> else >> wbinvd(); > > The cpu_has_wbnoinvd check isn't necessary. The encoding was chosen > because it does get interpreted as wbinvd on older processors. I agree, but wanted to make the code look complete / consistent. Would you be okay with the && ... being retained, but in a comment? >> --- a/xen/include/asm-x86/system.h >> +++ b/xen/include/asm-x86/system.h >> @@ -16,6 +16,11 @@ static inline void wbinvd(void) >> asm volatile ( "wbinvd" ::: "memory" ); >> } >> >> +static inline void wbnoinvd(void) >> +{ >> + asm volatile ( "repe; wbinvd" : : : "memory" ); > > Semicolon. It has to stay, as gas rejects use of REP on insns it doesn't think permit use of REP. H.J. 
actually proposes even more strict (or should I say hostile) gas behavior, which would then also reject the above construct: https://sourceware.org/ml/binutils/2019-07/msg00186.html >> --- a/xen/include/public/arch-x86/cpufeatureset.h >> +++ b/xen/include/public/arch-x86/cpufeatureset.h >> @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / >> >> /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ >> XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ >> +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ > > This is implicitly linked with CPUID.8000001d which we don't expose yet. On AMD, but not (so far at least, judging from the SDM) on Intel. > To get the emulation side of things sorted, I'd be happy with this going > in without the A for now, and "exposing WBNOINVD to guests" can be a > followup task. I've dropped the A for now, but as per above I'm not entirely certain that's appropriate; it's certainly the more defensive step. My uncertainty is also because people are free to use the WBNOINVD encoding even without the feature flag set, as it won't #UD (as you also suggest elsewhere in your reply). Jan 
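The gas workarounds discussed above all come down to emitting prefix bytes by hand, because both insns are prefix variants of older ones. A byte-level sketch of the encodings involved (values taken from the patch context and the public encodings, not from a disassembler run): WBNOINVD is an F3 (REPE) prefix on WBINVD's 0F 09, so older CPUs that ignore the prefix execute plain WBINVD; the CLWB fallback ".byte 0x66, 0x0f, 0xae, 0x32" is a 66 (data16) prefix on XSAVEOPT's 0F AE opcode with ModRM 0x32, i.e. mod=00, reg=/6 (selecting CLWB) and rm=010b for (%rdx), hence the "d" operand constraint in the patch:

```c
#include <assert.h>

static const unsigned char wbinvd_bytes[]   = { 0x0f, 0x09 };
static const unsigned char wbnoinvd_bytes[] = { 0xf3, 0x0f, 0x09 };
static const unsigned char clwb_rdx_bytes[] = { 0x66, 0x0f, 0xae, 0x32 };

/* ModRM field extraction helpers (standard x86 encoding layout). */
static unsigned int modrm_mod(unsigned char m) { return m >> 6; }
static unsigned int modrm_reg(unsigned char m) { return (m >> 3) & 7; }
static unsigned int modrm_rm(unsigned char m)  { return m & 7; }
```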
* Re: [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD 2019-08-27 15:08 ` Jan Beulich @ 2019-08-27 16:45 ` Andrew Cooper 2019-08-28 11:42 ` Andrew Cooper 0 siblings, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 16:45 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 27/08/2019 16:08, Jan Beulich wrote: > On 27.08.2019 16:47, Andrew Cooper wrote: >> On 01/07/2019 12:56, Jan Beulich wrote: >>> --- a/xen/arch/x86/pv/emul-priv-op.c >>> +++ b/xen/arch/x86/pv/emul-priv-op.c >>> @@ -1121,7 +1121,7 @@ static int write_msr(unsigned int reg, u >>> @@ -1130,6 +1130,8 @@ static int cache_op(enum x86emul_cache_o >>> * newer linux uses this in some start-of-day timing loops. >>> */ >>> ; >>> + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) >>> + wbnoinvd(); >>> else >>> wbinvd(); >> >> The cpu_has_wbnoinvd check isn't necessary. The encoding was chosen >> because it does get interpreted as wbinvd on older processors. > > I agree, but wanted to make the code look complete / consistent. > Would you be okay with the && ... being retained, but in a > comment? Ok. > >>> --- a/xen/include/asm-x86/system.h >>> +++ b/xen/include/asm-x86/system.h >>> @@ -16,6 +16,11 @@ static inline void wbinvd(void) >>> asm volatile ( "wbinvd" ::: "memory" ); >>> } >>> +static inline void wbnoinvd(void) >>> +{ >>> + asm volatile ( "repe; wbinvd" : : : "memory" ); >> >> Semicolon. > > It has to stay, as gas rejects use of REP on insns it doesn't think > permit use of REP. H.J. actually proposes even more strict (or should > I say hostile) gas behavior, which would then also reject the above > construct: > https://sourceware.org/ml/binutils/2019-07/msg00186.html Oh. That's dull, especially considering the vendors propensity to create new instructions by using prefixes in this way. 
> >>> --- a/xen/include/public/arch-x86/cpufeatureset.h >>> +++ b/xen/include/public/arch-x86/cpufeatureset.h >>> @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / >>> /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word >>> 8 */ >>> XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ >>> +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ >> >> This is implicitly linked with CPUID.8000001d which we don't expose yet. > > On AMD, but not (so far at least, judging from the SDM) on Intel. Intel have leaf 4 which specifies WBINVD's behaviour on different cache levels. TBH, I'd expect to see an adjustment to the leaf 4 documentation to include WBNOINVD. AMD explicitly doesn't have leaf 4. Their leaf 0x8000001d is similar in behaviour (by having subleafs), and is mostly compatible, but irritatingly doesn't have an identical data layout. I've got another query out with AMD because the documentation for this leaf alone claims that CPUID will #UD with an out-of-range subleaf, and I don't believe this can be true. > >> To get the emulation side of things sorted, I'd be happy with this going >> in without the A for now, and "exposing WBNOINVD to guests" can be a >> followup task. > > I've dropped the A for now, but as per above I'm not entirely > certain that's appropriate; it's certainly the more defensive step. > My uncertainty is also because people are free to use the WBNOINVD > encoding even without the feature flag set, as it won't #UD (as you > also suggest elsewhere in your reply). And the emulate behaviour matches, by falling back to WBINVD. I'd prefer to avoid advertising the feature when we have known work still to do, because otherwise we will inevitably forget to do it. Having functionality in the emulator without the feature being advertised is fine in general, because that still behaves like a pipeline which has had CPUID levelled down. 
~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD
  2019-08-27 16:45 ` Andrew Cooper
@ 2019-08-28 11:42   ` Andrew Cooper
  0 siblings, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2019-08-28 11:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, Roger Pau Monne

On 27/08/2019 17:45, Andrew Cooper wrote:
> AMD explicitly doesn't have leaf 4.  Their leaf 0x8000001d is similar in
> behaviour (by having subleafs), and is mostly compatible, but
> irritatingly doesn't have an identical data layout.
>
> I've got another query out with AMD because the documentation for this
> leaf alone claims that CPUID will #UD with an out-of-range subleaf, and
> I don't believe this can be true.

Current feedback is that this looks like a documentation bug, and that
CPUs don't actually behave like that.

~Andrew

^ permalink raw reply	[flat|nested] 36+ messages in thread
* [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich 2019-07-01 11:56 ` [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD Jan Beulich @ 2019-07-01 11:56 ` Jan Beulich 2019-07-02 10:47 ` Paul Durrant 2019-08-27 14:55 ` Andrew Cooper 2019-07-01 11:57 ` [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h Jan Beulich ` (3 subsequent siblings) 6 siblings, 2 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:56 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, RogerPau Monne The hook is already in use for INVLPGA as well. Rename the hook and add parameters. For the moment INVLPGA with a non-zero ASID remains unsupported, but the TODO item gets pushed into the actual hook handler. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: New. --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -370,16 +370,23 @@ static int fuzz_cmpxchg( return maybe_fail(ctxt, "cmpxchg", true); } -static int fuzz_invlpg( - enum x86_segment seg, - unsigned long offset, +static int fuzz_tlb_op( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, struct x86_emulate_ctxt *ctxt) { - /* invlpg(), unlike all other hooks, may be called with x86_seg_none. 
*/ - assert(is_x86_user_segment(seg) || seg == x86_seg_none); - assert(ctxt->addr_size == 64 || !(offset >> 32)); + switch ( op ) + { + case x86emul_invlpg: + assert(is_x86_user_segment(aux)); + /* fall through */ + case x86emul_invlpga: + assert(ctxt->addr_size == 64 || !(addr >> 32)); + break; + } - return maybe_fail(ctxt, "invlpg", false); + return maybe_fail(ctxt, "TLB-management", false); } static int fuzz_cache_op( @@ -624,7 +631,7 @@ static const struct x86_emulate_ops all_ SET(read_msr), SET(write_msr), SET(cache_op), - SET(invlpg), + SET(tlb_op), .get_fpu = emul_test_get_fpu, .put_fpu = emul_test_put_fpu, .cpuid = emul_test_cpuid, @@ -733,12 +740,12 @@ enum { HOOK_read_msr, HOOK_write_msr, HOOK_cache_op, + HOOK_tlb_op, HOOK_cpuid, HOOK_inject_hw_exception, HOOK_inject_sw_interrupt, HOOK_get_fpu, HOOK_put_fpu, - HOOK_invlpg, HOOK_vmfunc, CANONICALIZE_rip, CANONICALIZE_rsp, @@ -777,9 +784,9 @@ static void disable_hooks(struct x86_emu MAYBE_DISABLE_HOOK(read_msr); MAYBE_DISABLE_HOOK(write_msr); MAYBE_DISABLE_HOOK(cache_op); + MAYBE_DISABLE_HOOK(tlb_op); MAYBE_DISABLE_HOOK(cpuid); MAYBE_DISABLE_HOOK(get_fpu); - MAYBE_DISABLE_HOOK(invlpg); } /* --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2339,36 +2339,53 @@ static void hvmemul_put_fpu( } } -static int hvmemul_invlpg( - enum x86_segment seg, - unsigned long offset, +static int hvmemul_tlb_op( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, struct x86_emulate_ctxt *ctxt) { struct hvm_emulate_ctxt *hvmemul_ctxt = container_of(ctxt, struct hvm_emulate_ctxt, ctxt); - unsigned long addr, reps = 1; - int rc; - - rc = hvmemul_virtual_to_linear( - seg, offset, 1, &reps, hvm_access_none, hvmemul_ctxt, &addr); + unsigned long reps = 1; + int rc = X86EMUL_OKAY; - if ( rc == X86EMUL_EXCEPTION ) + switch ( op ) { - /* - * `invlpg` takes segment bases into account, but is not subject to - * faults from segment type/limit checks, and is specified as a NOP - * when issued on 
non-canonical addresses. - * - * hvmemul_virtual_to_linear() raises exceptions for type/limit - * violations, so squash them. - */ - x86_emul_reset_event(ctxt); - rc = X86EMUL_OKAY; + case x86emul_invlpg: + rc = hvmemul_virtual_to_linear(aux, addr, 1, &reps, hvm_access_none, + hvmemul_ctxt, &addr); + + if ( rc == X86EMUL_EXCEPTION ) + { + /* + * `invlpg` takes segment bases into account, but is not subject + * to faults from segment type/limit checks, and is specified as + * a NOP when issued on non-canonical addresses. + * + * hvmemul_virtual_to_linear() raises exceptions for type/limit + * violations, so squash them. + */ + x86_emul_reset_event(ctxt); + rc = X86EMUL_OKAY; + } + + if ( rc == X86EMUL_OKAY ) + paging_invlpg(current, addr); + break; + + case x86emul_invlpga: + /* TODO: Support ASIDs. */ + if ( !aux ) + paging_invlpg(current, addr); + else + { + x86_emul_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt); + rc = X86EMUL_EXCEPTION; + } + break; } - if ( rc == X86EMUL_OKAY ) - paging_invlpg(current, addr); - return rc; } @@ -2408,10 +2425,10 @@ static const struct x86_emulate_ops hvm_ .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr, .cache_op = hvmemul_cache_op, + .tlb_op = hvmemul_tlb_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, - .invlpg = hvmemul_invlpg, .vmfunc = hvmemul_vmfunc, }; @@ -2435,10 +2452,10 @@ static const struct x86_emulate_ops hvm_ .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr_discard, .cache_op = hvmemul_cache_op_discard, + .tlb_op = hvmemul_tlb_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, - .invlpg = hvmemul_invlpg, .vmfunc = hvmemul_vmfunc, }; --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -5590,10 +5590,9 @@ x86_emulate( generate_exception_if(!(msr_val & EFER_SVME) || !in_protmode(ctxt, ops), EXC_UD); generate_exception_if(!mode_ring0(), EXC_GP, 0); - 
generate_exception_if(_regs.ecx, EXC_UD); /* TODO: Support ASIDs. */ - fail_if(ops->invlpg == NULL); - if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.r(ax)), - ctxt)) ) + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invlpga, truncate_ea(_regs.r(ax)), + _regs.ecx, ctxt)) != X86EMUL_OKAY ) goto done; break; @@ -5747,8 +5746,9 @@ x86_emulate( case GRP7_MEM(7): /* invlpg */ ASSERT(ea.type == OP_MEM); generate_exception_if(!mode_ring0(), EXC_GP, 0); - fail_if(ops->invlpg == NULL); - if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) ) + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invlpg, ea.mem.off, ea.mem.seg, + ctxt)) != X86EMUL_OKAY ) goto done; break; --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -185,6 +185,11 @@ enum x86emul_cache_op { x86emul_wbnoinvd, }; +enum x86emul_tlb_op { + x86emul_invlpg, + x86emul_invlpga, +}; + struct x86_emulate_state; /* @@ -472,6 +477,19 @@ struct x86_emulate_ops unsigned long offset, struct x86_emulate_ctxt *ctxt); + /* + * tlb_op: Invalidate paging structures which map addressed byte. + * + * @addr and @aux have @op-specific meaning: + * - INVLPG: @aux:@addr represent seg:offset + * - INVLPGA: @addr is the linear address, @aux the ASID + */ + int (*tlb_op)( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, + struct x86_emulate_ctxt *ctxt); + /* cpuid: Emulate CPUID via given set of EAX-EDX inputs/outputs. */ int (*cpuid)( uint32_t leaf, @@ -499,12 +517,6 @@ struct x86_emulate_ops enum x86_emulate_fpu_type backout, const struct x86_emul_fpu_aux *aux); - /* invlpg: Invalidate paging structures which map addressed byte. 
*/ - int (*invlpg)( - enum x86_segment seg, - unsigned long offset, - struct x86_emulate_ctxt *ctxt); - /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */ int (*vmfunc)( struct x86_emulate_ctxt *ctxt); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook
  2019-07-01 11:56 ` [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook Jan Beulich
@ 2019-07-02 10:47   ` Paul Durrant
  2019-08-27 14:55   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Durrant @ 2019-07-02 10:47 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich <JBeulich@suse.com>
> Sent: 01 July 2019 12:57
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH 3/6] x86emul: generalize invlpg() hook
>
> The hook is already in use for INVLPGA as well. Rename the hook and add
> parameters. For the moment INVLPGA with a non-zero ASID remains
> unsupported, but the TODO item gets pushed into the actual hook handler.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook
  2019-07-01 11:56 ` [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook Jan Beulich
  2019-07-02 10:47   ` Paul Durrant
@ 2019-08-27 14:55   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2019-08-27 14:55 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne

On 01/07/2019 12:56, Jan Beulich wrote:
> The hook is already in use for INVLPGA as well. Rename the hook and add
> parameters. For the moment INVLPGA with a non-zero ASID remains
> unsupported, but the TODO item gets pushed into the actual hook handler.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread
* [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich ` (2 preceding siblings ...) 2019-07-01 11:56 ` [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook Jan Beulich @ 2019-07-01 11:57 ` Jan Beulich 2019-07-02 10:49 ` Paul Durrant 2019-08-27 14:57 ` Andrew Cooper 2019-07-01 11:57 ` [Xen-devel] [PATCH 5/6] x86emul: support INVPCID Jan Beulich ` (2 subsequent siblings) 6 siblings, 2 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:57 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, RogerPau Monne This way the insn emulator can then too use the #define-s. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: New. --- a/xen/include/asm-x86/invpcid.h +++ b/xen/include/asm-x86/invpcid.h @@ -5,11 +5,6 @@ extern bool use_invpcid; -#define INVPCID_TYPE_INDIV_ADDR 0 -#define INVPCID_TYPE_SINGLE_CTXT 1 -#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 -#define INVPCID_TYPE_ALL_NON_GLOBAL 3 - #define INVPCID_OPCODE ".byte 0x66, 0x0f, 0x38, 0x82\n" #define MODRM_ECX_01 ".byte 0x01\n" @@ -38,25 +33,25 @@ static inline void invpcid(unsigned int /* Flush all mappings for a given PCID and addr, not including globals */ static inline void invpcid_flush_one(unsigned int pcid, unsigned long addr) { - invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); + invpcid(pcid, addr, X86_INVPCID_TYPE_INDIV_ADDR); } /* Flush all mappings for a given PCID, not including globals */ static inline void invpcid_flush_single_context(unsigned int pcid) { - invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); + invpcid(pcid, 0, X86_INVPCID_TYPE_SINGLE_CTXT); } /* Flush all mappings, including globals, for all PCIDs */ static inline void invpcid_flush_all(void) { - invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); + invpcid(0, 0, X86_INVPCID_TYPE_ALL_INCL_GLOBAL); } /* Flush all mappings for all PCIDs, excluding globals */ static inline void invpcid_flush_all_nonglobals(void) { - invpcid(0, 0, 
INVPCID_TYPE_ALL_NON_GLOBAL); + invpcid(0, 0, X86_INVPCID_TYPE_ALL_NON_GLOBAL); } #endif /* _ASM_X86_INVPCID_H_ */ --- a/xen/include/asm-x86/x86-defns.h +++ b/xen/include/asm-x86/x86-defns.h @@ -108,4 +108,12 @@ */ #define X86_DR7_DEFAULT 0x00000400 /* Default %dr7 value. */ +/* + * Invalidation types for the INVPCID instruction. + */ +#define X86_INVPCID_TYPE_INDIV_ADDR 0 +#define X86_INVPCID_TYPE_SINGLE_CTXT 1 +#define X86_INVPCID_TYPE_ALL_INCL_GLOBAL 2 +#define X86_INVPCID_TYPE_ALL_NON_GLOBAL 3 + #endif /* __XEN_X86_DEFNS_H__ */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h
  2019-07-01 11:57 ` [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h Jan Beulich
@ 2019-07-02 10:49   ` Paul Durrant
  2019-08-27 14:57   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Durrant @ 2019-07-02 10:49 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich <JBeulich@suse.com>
> Sent: 01 July 2019 12:57
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h
>
> This way the insn emulator can then too use the #define-s.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h
  2019-07-01 11:57 ` [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h Jan Beulich
  2019-07-02 10:49   ` Paul Durrant
@ 2019-08-27 14:57   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Andrew Cooper @ 2019-08-27 14:57 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne

On 01/07/2019 12:57, Jan Beulich wrote:
> --- a/xen/include/asm-x86/x86-defns.h
> +++ b/xen/include/asm-x86/x86-defns.h
> @@ -108,4 +108,12 @@
>   */
>  #define X86_DR7_DEFAULT 0x00000400 /* Default %dr7 value. */
>
> +/*
> + * Invalidation types for the INVPCID instruction.
> + */
> +#define X86_INVPCID_TYPE_INDIV_ADDR      0
> +#define X86_INVPCID_TYPE_SINGLE_CTXT     1
> +#define X86_INVPCID_TYPE_ALL_INCL_GLOBAL 2
> +#define X86_INVPCID_TYPE_ALL_NON_GLOBAL  3

I'd personally take the opportunity to remove the TYPE infix, as it
isn't terribly useful.

Preferably with this done,
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread
* [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich ` (3 preceding siblings ...) 2019-07-01 11:57 ` [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h Jan Beulich @ 2019-07-01 11:57 ` Jan Beulich 2019-07-02 12:54 ` Paul Durrant 2019-08-27 15:31 ` Andrew Cooper 2019-07-01 11:58 ` [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns Jan Beulich 2019-08-15 14:24 ` [Xen-devel] [PATCH 0/6] x86emul: further work Andrew Cooper 6 siblings, 2 replies; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:57 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, RogerPau Monne Just like for INVLPGA the HVM hook only supports PCID 0 for the time being for individual address invalidation. It also translates the other types to a full flush, which is architecturally permitted and performance-wise presumably not much worse because emulation is slow anyway. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v2: New. 
--- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -382,6 +382,7 @@ static int fuzz_tlb_op( assert(is_x86_user_segment(aux)); /* fall through */ case x86emul_invlpga: + case x86emul_invpcid: assert(ctxt->addr_size == 64 || !(addr >> 32)); break; } --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -684,6 +684,38 @@ static int read_msr( return X86EMUL_UNHANDLEABLE; } +#define INVPCID_ADDR 0x12345678 +#define INVPCID_PCID 0x123 + +static int read_cr_invpcid( + unsigned int reg, + unsigned long *val, + struct x86_emulate_ctxt *ctxt) +{ + int rc = emul_test_read_cr(reg, val, ctxt); + + if ( rc == X86EMUL_OKAY && reg == 4 ) + *val |= X86_CR4_PCIDE; + + return rc; +} + +static int tlb_op_invpcid( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, + struct x86_emulate_ctxt *ctxt) +{ + static unsigned int seq; + + if ( op != x86emul_invpcid || addr != INVPCID_ADDR || + x86emul_invpcid_pcid(aux) != (seq < 4 ? 
0 : INVPCID_PCID) || + x86emul_invpcid_type(aux) != (seq++ & 3) ) + return X86EMUL_UNHANDLEABLE; + + return X86EMUL_OKAY; +} + static struct x86_emulate_ops emulops = { .read = read, .insn_fetch = fetch, @@ -4482,6 +4514,46 @@ int main(int argc, char **argv) printf("okay\n"); } else + printf("skipped\n"); + + printf("%-40s", "Testing invpcid 16(%ecx),%%edx..."); + if ( stack_exec ) + { + decl_insn(invpcid); + + asm volatile ( put_insn(invpcid, "invpcid 16(%0), %1") + :: "c" (NULL), "d" (0L) ); + + res[4] = 0; + res[5] = 0; + res[6] = INVPCID_ADDR; + res[7] = 0; + regs.ecx = (unsigned long)res; + emulops.tlb_op = tlb_op_invpcid; + + for ( ; ; ) + { + for ( regs.edx = 0; regs.edx < 4; ++regs.edx ) + { + set_insn(invpcid); + rc = x86_emulate(&ctxt, &emulops); + if ( rc != X86EMUL_OKAY || !check_eip(invpcid) ) + goto fail; + } + + if ( ctxt.addr_size < 64 || res[4] == INVPCID_PCID ) + break; + + emulops.read_cr = read_cr_invpcid; + res[4] = INVPCID_PCID; + } + + emulops.read_cr = emul_test_read_cr; + emulops.tlb_op = NULL; + + printf("okay\n"); + } + else printf("skipped\n"); #undef decl_insn --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -72,6 +72,7 @@ bool emul_test_init(void) * them. */ cp.basic.movbe = true; + cp.feat.invpcid = true; cp.feat.adx = true; cp.feat.avx512pf = cp.feat.avx512f; cp.feat.rdpid = true; @@ -141,7 +142,7 @@ int emul_test_cpuid( */ if ( leaf == 7 && subleaf == 0 ) { - res->b |= 1U << 19; + res->b |= (1U << 10) | (1U << 19); if ( res->b & (1U << 16) ) res->b |= 1U << 26; res->c |= 1U << 22; --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2374,8 +2374,16 @@ static int hvmemul_tlb_op( paging_invlpg(current, addr); break; + case x86emul_invpcid: + if ( x86emul_invpcid_type(aux) != X86_INVPCID_TYPE_INDIV_ADDR ) + { + hvm_asid_flush_vcpu(current); + break; + } + aux = x86emul_invpcid_pcid(aux); + /* fall through */ case x86emul_invlpga: - /* TODO: Support ASIDs. 
*/ + /* TODO: Support ASIDs/PCIDs. */ if ( !aux ) paging_invlpg(current, addr); else --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -496,6 +496,7 @@ static const struct ext0f38_table { [0x7a ... 0x7c] = { .simd_size = simd_none, .two_op = 1 }, [0x7d ... 0x7e] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x7f] = { .simd_size = simd_packed_fp, .d8s = d8s_vl }, + [0x82] = { .simd_size = simd_other }, [0x83] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x88] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_dq }, [0x89] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_dq }, @@ -1875,6 +1876,7 @@ in_protmode( #define vcpu_has_hle() (ctxt->cpuid->feat.hle) #define vcpu_has_avx2() (ctxt->cpuid->feat.avx2) #define vcpu_has_bmi2() (ctxt->cpuid->feat.bmi2) +#define vcpu_has_invpcid() (ctxt->cpuid->feat.invpcid) #define vcpu_has_rtm() (ctxt->cpuid->feat.rtm) #define vcpu_has_mpx() (ctxt->cpuid->feat.mpx) #define vcpu_has_avx512f() (ctxt->cpuid->feat.avx512f) @@ -9124,6 +9126,48 @@ x86_emulate( ASSERT(!state->simd_size); break; + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ + vcpu_must_have(invpcid); + generate_exception_if(ea.type != OP_MEM, EXC_UD); + generate_exception_if(!mode_ring0(), EXC_GP, 0); + + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, + ctxt)) != X86EMUL_OKAY ) + goto done; + + generate_exception_if(mmvalp->xmm[0] & ~0xfff, EXC_GP, 0); + dst.val = mode_64bit() ? 
*dst.reg : (uint32_t)*dst.reg; + + switch ( dst.val ) + { + case X86_INVPCID_TYPE_INDIV_ADDR: + generate_exception_if(!is_canonical_address(mmvalp->xmm[1]), + EXC_GP, 0); + /* fall through */ + case X86_INVPCID_TYPE_SINGLE_CTXT: + if ( !mode_64bit() || !ops->read_cr ) + cr4 = 0; + else if ( (rc = ops->read_cr(4, &cr4, ctxt)) != X86EMUL_OKAY ) + goto done; + generate_exception_if(!(cr4 & X86_CR4_PCIDE) && mmvalp->xmm[0], + EXC_GP, 0); + break; + case X86_INVPCID_TYPE_ALL_INCL_GLOBAL: + case X86_INVPCID_TYPE_ALL_NON_GLOBAL: + break; + default: + generate_exception(EXC_GP, 0); + } + + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invpcid, truncate_ea(mmvalp->xmm[1]), + x86emul_invpcid_aux(mmvalp->xmm[0], dst.val), + ctxt)) != X86EMUL_OKAY ) + goto done; + + state->simd_size = simd_none; + break; + case X86EMUL_OPC_EVEX_66(0x0f38, 0x83): /* vpmultishiftqb [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ generate_exception_if(!evex.w, EXC_UD); host_and_vcpu_must_have(avx512_vbmi); --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -188,8 +188,26 @@ enum x86emul_cache_op { enum x86emul_tlb_op { x86emul_invlpg, x86emul_invlpga, + x86emul_invpcid, }; +static inline unsigned int x86emul_invpcid_aux(unsigned int pcid, + unsigned int type) +{ + ASSERT(!(pcid & ~0xfff)); + return (type << 12) | pcid; +} + +static inline unsigned int x86emul_invpcid_pcid(unsigned int aux) +{ + return aux & 0xfff; +} + +static inline unsigned int x86emul_invpcid_type(unsigned int aux) +{ + return aux >> 12; +} + struct x86_emulate_state; /* @@ -483,6 +501,8 @@ struct x86_emulate_ops * @addr and @aux have @op-specific meaning: * - INVLPG: @aux:@addr represent seg:offset * - INVLPGA: @addr is the linear address, @aux the ASID + * - INVPCID: @addr is the linear address, @aux the combination of + * PCID and type (see x86emul_invpcid_*()). 
*/ int (*tlb_op)( enum x86emul_tlb_op op, _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID
  2019-07-01 11:57 ` [Xen-devel] [PATCH 5/6] x86emul: support INVPCID Jan Beulich
@ 2019-07-02 12:54   ` Paul Durrant
  2019-08-27 15:31   ` Andrew Cooper
  1 sibling, 0 replies; 36+ messages in thread
From: Paul Durrant @ 2019-07-02 12:54 UTC (permalink / raw)
  To: 'Jan Beulich', xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

> -----Original Message-----
> From: Jan Beulich <JBeulich@suse.com>
> Sent: 01 July 2019 12:58
> To: xen-devel@lists.xenproject.org
> Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>; Paul Durrant <Paul.Durrant@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; Wei Liu <wl@xen.org>
> Subject: [PATCH 5/6] x86emul: support INVPCID
>
> Just like for INVLPGA the HVM hook only supports PCID 0 for the time
> being for individual address invalidation. It also translates the other
> types to a full flush, which is architecturally permitted and
> performance-wise presumably not much worse because emulation is slow
> anyway.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

hvm/emulate bits...

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

^ permalink raw reply	[flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-07-01 11:57 ` [Xen-devel] [PATCH 5/6] x86emul: support INVPCID Jan Beulich 2019-07-02 12:54 ` Paul Durrant @ 2019-08-27 15:31 ` Andrew Cooper 2019-08-27 15:53 ` Jan Beulich 1 sibling, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 15:31 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, RogerPau Monne On 01/07/2019 12:57, Jan Beulich wrote: > --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -9124,6 +9126,48 @@ x86_emulate( > ASSERT(!state->simd_size); > break; > > + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ > + vcpu_must_have(invpcid); > + generate_exception_if(ea.type != OP_MEM, EXC_UD); > + generate_exception_if(!mode_ring0(), EXC_GP, 0); > + > + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, > + ctxt)) != X86EMUL_OKAY ) > + goto done; The actual behaviour in hardware is to not even read the memory operand if it is unused. You can demonstrate this by doing an ALL_INC_GLOBAL flush with a non-canonical memory operand. In particular, I was intending to use this behaviour to speed up handling of INV{EPT,VPID} which trap unconditionally. However, this is how the instruction is described in the SDM, and INVPCID should usually execute without trapping, so the unconditional read should be fine. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-08-27 15:31 ` Andrew Cooper @ 2019-08-27 15:53 ` Jan Beulich 2019-08-27 17:27 ` Andrew Cooper 0 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-08-27 15:53 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 27.08.2019 17:31, Andrew Cooper wrote: > On 01/07/2019 12:57, Jan Beulich wrote: >> --- a/xen/arch/x86/x86_emulate/x86_emulate.c >> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c >> @@ -9124,6 +9126,48 @@ x86_emulate( >> ASSERT(!state->simd_size); >> break; >> >> + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ >> + vcpu_must_have(invpcid); >> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >> + generate_exception_if(!mode_ring0(), EXC_GP, 0); >> + >> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, >> + ctxt)) != X86EMUL_OKAY ) >> + goto done; > > The actual behaviour in hardware is to not even read the memory operand > if it is unused. You can demonstrate this by doing an ALL_INC_GLOBAL > flush with a non-canonical memory operand. Oh, that's sort of unexpected. > In particular, I was > intending to use this behaviour to speed up handling of INV{EPT,VPID} > which trap unconditionally. Which would require the observed behavior to also be the SDM mandated one, wouldn't it? > However, this is how the instruction is described in the SDM, and > INVPCID should usually execute without trapping, so the unconditional > read should be fine. > > Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Thanks. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-08-27 15:53 ` Jan Beulich @ 2019-08-27 17:27 ` Andrew Cooper 2019-08-28 7:14 ` Jan Beulich 0 siblings, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 17:27 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 27/08/2019 16:53, Jan Beulich wrote: > On 27.08.2019 17:31, Andrew Cooper wrote: >> On 01/07/2019 12:57, Jan Beulich wrote: >>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c >>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c >>> @@ -9124,6 +9126,48 @@ x86_emulate( >>> ASSERT(!state->simd_size); >>> break; >>> + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ >>> + vcpu_must_have(invpcid); >>> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >>> + generate_exception_if(!mode_ring0(), EXC_GP, 0); >>> + >>> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, >>> + ctxt)) != X86EMUL_OKAY ) >>> + goto done; >> >> The actual behaviour in hardware is to not even read the memory operand >> if it is unused. You can demonstrate this by doing an ALL_INC_GLOBAL >> flush with a non-canonical memory operand. > > Oh, that's sort of unexpected. It makes sense as an optimisation. There is no point fetching a memory operand if you're not going to use it. Furthermore, it almost certainly reduces the microcode complexity. > >> In particular, I was >> intending to use this behaviour to speed up handling of INV{EPT,VPID} >> which trap unconditionally. > > Which would require the observed behavior to also be the SDM > mandated one, wouldn't it? If you recall, we discussed this with Jun in Budapest. His opinion was no instructions go out of their way to check properties which don't matter - it is just that it is far more obvious with instructions like these where the complexity is much greater. No production systems are going to rely on getting faults, because taking a fault doesn't produce useful work. 
~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-08-27 17:27 ` Andrew Cooper @ 2019-08-28 7:14 ` Jan Beulich 2019-08-28 11:33 ` Andrew Cooper 0 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-08-28 7:14 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, PaulDurrant, Wei Liu, RogerPau Monne On 27.08.2019 19:27, Andrew Cooper wrote: > On 27/08/2019 16:53, Jan Beulich wrote: >> On 27.08.2019 17:31, Andrew Cooper wrote: >>> On 01/07/2019 12:57, Jan Beulich wrote: >>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c >>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c >>>> @@ -9124,6 +9126,48 @@ x86_emulate( >>>> ASSERT(!state->simd_size); >>>> break; >>>> + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ >>>> + vcpu_must_have(invpcid); >>>> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >>>> + generate_exception_if(!mode_ring0(), EXC_GP, 0); >>>> + >>>> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, >>>> + ctxt)) != X86EMUL_OKAY ) >>>> + goto done; >>> >>> The actual behaviour in hardware is to not even read the memory operand >>> if it is unused. You can demonstrate this by doing an ALL_INC_GLOBAL >>> flush with a non-canonical memory operand. >> >> Oh, that's sort of unexpected. > > It makes sense as an optimisation. There is no point fetching a memory > operand if you're not going to use it. > > Furthermore, it almost certainly reduces the microcode complexity. Probably. For comparison I had been thinking of 0-bit shifts instead, which do read their memory operands. Even SHLD/SHRD, which at least with shift count in %cl look to be uniformly microcoded, access their memory operand in this case. >>> In particular, I was >>> intending to use this behaviour to speed up handling of INV{EPT,VPID} >>> which trap unconditionally. >> >> Which would require the observed behavior to also be the SDM >> mandated one, wouldn't it? > > If you recall, we discussed this with Jun in Budapest. 
His opinion was > no instructions go out of their way to check properties which don't > matter - it is just that it is far more obvious with instructions like > these where the complexity is much greater. > > No production systems are going to rely on getting faults, because > taking a fault doesn't produce useful work. Maybe I misunderstood your earlier reply then: I read it to mean you want to leverage INVPCID not faulting on "bad" memory operands for flush types not using the memory operand. But perhaps you merely meant you want to leverage the insn not _accessing_ its memory operand in this case? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
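The guard being discussed — only fetching INVPCID's 128-bit memory operand when the flush type actually consumes it — can be sketched as a small predicate. This is a hypothetical helper, not the actual Xen code; the type numbering follows the SDM's four INVPCID types (0 = individual address, 1 = single context, 2 = all contexts including globals, 3 = all contexts excluding globals), of which only the first two use the descriptor's contents.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not taken from the patch): per the discussion
 * above, hardware doesn't read INVPCID's memory operand for the flush
 * types that ignore the descriptor, so an emulator could skip the
 * ops->read() call for types 2 and 3.
 */
enum invpcid_type {
    INVPCID_INDIV_ADDR      = 0, /* uses descriptor (PCID + address) */
    INVPCID_SINGLE_CTXT     = 1, /* uses descriptor (PCID) */
    INVPCID_ALL_INCL_GLOBAL = 2, /* descriptor contents ignored */
    INVPCID_ALL_NON_GLOBAL  = 3, /* descriptor contents ignored */
};

static bool invpcid_uses_descriptor(enum invpcid_type type)
{
    return type == INVPCID_INDIV_ADDR || type == INVPCID_SINGLE_CTXT;
}
```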
* Re: [Xen-devel] [PATCH 5/6] x86emul: support INVPCID 2019-08-28 7:14 ` Jan Beulich @ 2019-08-28 11:33 ` Andrew Cooper 0 siblings, 0 replies; 36+ messages in thread From: Andrew Cooper @ 2019-08-28 11:33 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, PaulDurrant, Wei Liu, RogerPau Monne On 28/08/2019 08:14, Jan Beulich wrote: > On 27.08.2019 19:27, Andrew Cooper wrote: >> On 27/08/2019 16:53, Jan Beulich wrote: >>> On 27.08.2019 17:31, Andrew Cooper wrote: >>>> On 01/07/2019 12:57, Jan Beulich wrote: >>>>> --- a/xen/arch/x86/x86_emulate/x86_emulate.c >>>>> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c >>>>> @@ -9124,6 +9126,48 @@ x86_emulate( >>>>> ASSERT(!state->simd_size); >>>>> break; >>>>> + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ >>>>> + vcpu_must_have(invpcid); >>>>> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >>>>> + generate_exception_if(!mode_ring0(), EXC_GP, 0); >>>>> + >>>>> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, >>>>> + ctxt)) != X86EMUL_OKAY ) >>>>> + goto done; >>>> >>>> The actual behaviour in hardware is to not even read the memory >>>> operand >>>> if it is unused. You can demonstrate this by doing an ALL_INC_GLOBAL >>>> flush with a non-canonical memory operand. >>> >>> Oh, that's sort of unexpected. >> >> It makes sense as an optimisation. There is no point fetching a memory >> operand if you're not going to use it. >> >> Furthermore, it almost certainly reduces the microcode complexity. > > Probably. For comparison I had been thinking of 0-bit shifts instead, > which do read their memory operands. Even SHLD/SHRD, which at least > with shift count in %cl look to be uniformly microcoded, access their > memory operand in this case. Again, that isn't surprising to me. You will never see a shift by 0 anywhere but a test suite, because it is wasted effort. Therefore, any attempt to special case 0 will reduce performance in all production scenarios. 
SHLD/SHRD's microcoded-ness comes from having to construct a double width rotate. In the worst case, this is two independent rotate uops issued into the pipeline, and enough ALU logic to combine the results. If you observe, some CPUs have the 32bit versions non-microcoded, which will probably be the frontend converting up to a 64bit uop internally. INV{PCID,EPT,VPID} are all conceptually of the form: switch ( reg ) { ... construct tlb uop. } dispatch tlb uop. and avoiding one or two memory reads will make a meaningful performance improvement. > >>>> In particular, I was >>>> intending to use this behaviour to speed up handling of INV{EPT,VPID} >>>> which trap unconditionally. >>> >>> Which would require the observed behavior to also be the SDM >>> mandated one, wouldn't it? >> >> If you recall, we discussed this with Jun in Budapest. His opinion was >> no instructions go out of their way to check properties which don't >> matter - it is just that it is far more obvious with instructions like >> these where the complexity is much greater. >> >> No production systems are going to rely on getting faults, because >> taking a fault doesn't produce useful work. > > Maybe I misunderstood your earlier reply then: I read it to mean you > want to leverage INVPCID not faulting on "bad" memory operands for > flush types not using the memory operand. But perhaps you merely > meant you want to leverage the insn not _accessing_ its memory > operand in this case? Correct. Its to avoid unnecessary page walks. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
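The double-width shift construction described above can be modelled in plain C; this is purely illustrative (not Xen code), but it shows why a count of 0 is an awkward special case: the naive `src >> (64 - c)` term would be a 64-bit shift for c == 0, which is undefined behaviour in C, so the 0-count case has to be handled separately — just as it is a degenerate case for hardware.

```c
#include <stdint.h>

/*
 * Illustrative model of 64-bit SHLD semantics (not taken from Xen):
 * dst is shifted left by c, with vacated bits filled from the top of
 * src.  c == 0 must be special-cased: "src >> (64 - 0)" would be a
 * 64-bit shift, which is undefined behaviour in C, and SHLD with a
 * masked count of 0 leaves the destination unmodified.
 */
static uint64_t shld64(uint64_t dst, uint64_t src, unsigned int c)
{
    c &= 63; /* hardware masks the count to 6 bits in 64-bit mode */
    if ( c == 0 )
        return dst;
    return (dst << c) | (src >> (64 - c));
}
```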
* [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich ` (4 preceding siblings ...) 2019-07-01 11:57 ` [Xen-devel] [PATCH 5/6] x86emul: support INVPCID Jan Beulich @ 2019-07-01 11:58 ` Jan Beulich 2019-08-27 16:04 ` Andrew Cooper 2019-08-15 14:24 ` [Xen-devel] [PATCH 0/6] x86emul: further work Andrew Cooper 6 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-07-01 11:58 UTC (permalink / raw) To: xen-devel; +Cc: Andrew Cooper, Paul Durrant, Wei Liu, RogerPau Monne Note that the ISA extensions document revision 035 doesn't specify exception behavior for ModRM.mod != 0b11; assuming #UD here. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -548,6 +548,8 @@ static const struct ext0f38_table { [0xf1] = { .to_mem = 1, .two_op = 1 }, [0xf2 ... 0xf3] = {}, [0xf5 ... 0xf7] = {}, + [0xf8] = { .simd_size = simd_other }, + [0xf9] = { .to_mem = 1 }, }; /* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */ @@ -1902,6 +1904,8 @@ in_protmode( #define vcpu_has_avx512_bitalg() (ctxt->cpuid->feat.avx512_bitalg) #define vcpu_has_avx512_vpopcntdq() (ctxt->cpuid->feat.avx512_vpopcntdq) #define vcpu_has_rdpid() (ctxt->cpuid->feat.rdpid) +#define vcpu_has_movdiri() (ctxt->cpuid->feat.movdiri) +#define vcpu_has_movdir64b() (ctxt->cpuid->feat.movdir64b) #define vcpu_has_avx512_4vnniw() (ctxt->cpuid->feat.avx512_4vnniw) #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps) @@ -2693,10 +2697,12 @@ x86_decode_0f38( { case 0x00 ... 0xef: case 0xf2 ... 0xf5: - case 0xf7 ... 0xff: + case 0xf7 ... 0xf8: + case 0xfa ... 
0xff: op_bytes = 0; /* fall through */ case 0xf6: /* adcx / adox */ + case 0xf9: /* movdiri */ ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK); break; @@ -9896,6 +9902,32 @@ x86_emulate( : "0" ((uint32_t)src.val), "rm" (_regs.edx) ); break; + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ + vcpu_must_have(movdir64b); + generate_exception_if(ea.type != OP_MEM, EXC_UD); + src.val = truncate_ea(*dst.reg); + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, ctxt, ops), + EXC_GP, 0); + /* Ignore the non-temporal behavior for now. */ + fail_if(!ops->write); + BUILD_BUG_ON(sizeof(*mmvalp) < 64); + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, + ctxt)) != X86EMUL_OKAY || + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, + ctxt)) != X86EMUL_OKAY ) + goto done; + state->simd_size = simd_none; + sfence = true; + break; + + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ + vcpu_must_have(movdiri); + generate_exception_if(dst.type != OP_MEM, EXC_UD); + /* Ignore the non-temporal behavior for now. 
*/ + dst.val = src.val; + sfence = true; + break; + case X86EMUL_OPC_VEX_66(0x0f3a, 0x00): /* vpermq $imm8,ymm/m256,ymm */ case X86EMUL_OPC_VEX_66(0x0f3a, 0x01): /* vpermpd $imm8,ymm/m256,ymm */ generate_exception_if(!vex.l || !vex.w, EXC_UD); --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -237,6 +237,8 @@ XEN_CPUFEATURE(AVX512_BITALG, 6*32+12) / XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */ XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */ XEN_CPUFEATURE(CLDEMOTE, 6*32+25) /*A CLDEMOTE instruction */ +XEN_CPUFEATURE(MOVDIRI, 6*32+27) /*A MOVDIRI instruction */ +XEN_CPUFEATURE(MOVDIR64B, 6*32+28) /*A MOVDIR64B instruction */ /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */ XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */ --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -2196,6 +2196,36 @@ int main(int argc, char **argv) goto fail; printf("okay\n"); + printf("%-40s", "Testing movdiri %edx,(%ecx)..."); + instr[0] = 0x0f; instr[1] = 0x38; instr[2] = 0xf9; instr[3] = 0x11; + regs.eip = (unsigned long)&instr[0]; + regs.ecx = (unsigned long)memset(res, -1, 16); + regs.edx = 0x44332211; + rc = x86_emulate(&ctxt, &emulops); + if ( (rc != X86EMUL_OKAY) || + (regs.eip != (unsigned long)&instr[4]) || + res[0] != 0x44332211 || ~res[1] ) + goto fail; + printf("okay\n"); + + printf("%-40s", "Testing movdir64b 144(%edx),%ecx..."); + instr[0] = 0x66; instr[1] = 0x0f; instr[2] = 0x38; instr[3] = 0xf8; + instr[4] = 0x8a; instr[5] = 0x90; instr[8] = instr[7] = instr[6] = 0; + regs.eip = (unsigned long)&instr[0]; + for ( i = 0; i < 64; ++i ) + res[i] = i - 20; + regs.edx = (unsigned long)res; + regs.ecx = (unsigned long)(res + 16); + rc = x86_emulate(&ctxt, &emulops); + if ( (rc != X86EMUL_OKAY) || + (regs.eip != (unsigned long)&instr[9]) || + res[15] != -5 || res[32] != 12 ) + goto fail; + for ( i = 16; i < 32; ++i ) + 
if ( res[i] != i ) + goto fail; + printf("okay\n"); + printf("%-40s", "Testing movq %mm3,(%ecx)..."); if ( stack_exec && cpu_has_mmx ) { --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -76,6 +76,8 @@ bool emul_test_init(void) cp.feat.adx = true; cp.feat.avx512pf = cp.feat.avx512f; cp.feat.rdpid = true; + cp.feat.movdiri = true; + cp.feat.movdir64b = true; cp.extd.clzero = true; if ( cpu_has_xsave ) @@ -137,15 +139,15 @@ int emul_test_cpuid( res->c |= 1U << 22; /* - * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G prefetch - * insns, so we can always run the respective tests. + * The emulator doesn't itself use ADCX/ADOX/RDPID/MOVDIR* nor the S/G + * prefetch insns, so we can always run the respective tests. */ if ( leaf == 7 && subleaf == 0 ) { res->b |= (1U << 10) | (1U << 19); if ( res->b & (1U << 16) ) res->b |= 1U << 26; - res->c |= 1U << 22; + res->c |= (1U << 22) | (1U << 27) | (1U << 28); } /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
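One property of MOVDIRI worth keeping in mind when reading the emulation above is that the architecture guarantees the 4- or 8-byte store is performed as a single, non-torn write. As a rough stand-in — not Xen code — a C11 relaxed atomic store models that single-copy atomicity; the "direct" (write-combining, cache-bypassing) behaviour has no portable C equivalent and is ignored here, much as the patch itself ignores the non-temporal hint for now.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Illustrative stand-in for MOVDIRI's 32-bit form (not Xen code):
 * a single relaxed atomic store models the guaranteed non-torn
 * 4-byte write; the write-combining/non-temporal property is not
 * representable in portable C and is deliberately omitted.
 */
static void movdiri32_model(_Atomic uint32_t *dst, uint32_t val)
{
    atomic_store_explicit(dst, val, memory_order_relaxed);
}
```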
* Re: [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns 2019-07-01 11:58 ` [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns Jan Beulich @ 2019-08-27 16:04 ` Andrew Cooper 2019-08-28 6:26 ` Jan Beulich 0 siblings, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-27 16:04 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, RogerPau Monne On 01/07/2019 12:58, Jan Beulich wrote: > Note that the ISA extensions document revision 035 doesn't specify > exception behavior for ModRM.mod != 0b11; assuming #UD here. This has moved into the main SDM now. These instructions don't make sense with reg/reg encodings, so I expect that encoding space will be reused as a new group at some point in the future. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/x86_emulate/x86_emulate.c > +++ b/xen/arch/x86/x86_emulate/x86_emulate.c > @@ -548,6 +548,8 @@ static const struct ext0f38_table { > [0xf1] = { .to_mem = 1, .two_op = 1 }, > [0xf2 ... 0xf3] = {}, > [0xf5 ... 0xf7] = {}, > + [0xf8] = { .simd_size = simd_other }, > + [0xf9] = { .to_mem = 1 }, > }; > > /* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */ > @@ -1902,6 +1904,8 @@ in_protmode( > #define vcpu_has_avx512_bitalg() (ctxt->cpuid->feat.avx512_bitalg) > #define vcpu_has_avx512_vpopcntdq() (ctxt->cpuid->feat.avx512_vpopcntdq) > #define vcpu_has_rdpid() (ctxt->cpuid->feat.rdpid) > +#define vcpu_has_movdiri() (ctxt->cpuid->feat.movdiri) > +#define vcpu_has_movdir64b() (ctxt->cpuid->feat.movdir64b) > #define vcpu_has_avx512_4vnniw() (ctxt->cpuid->feat.avx512_4vnniw) > #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps) > > @@ -2693,10 +2697,12 @@ x86_decode_0f38( > { > case 0x00 ... 0xef: > case 0xf2 ... 0xf5: > - case 0xf7 ... 0xff: > + case 0xf7 ... 0xf8: > + case 0xfa ... 
0xff: > op_bytes = 0; > /* fall through */ > case 0xf6: /* adcx / adox */ > + case 0xf9: /* movdiri */ > ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK); > break; > > @@ -9896,6 +9902,32 @@ x86_emulate( > : "0" ((uint32_t)src.val), "rm" (_regs.edx) ); > break; > > + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ > + vcpu_must_have(movdir64b); > + generate_exception_if(ea.type != OP_MEM, EXC_UD); > + src.val = truncate_ea(*dst.reg); > + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, ctxt, ops), > + EXC_GP, 0); > + /* Ignore the non-temporal behavior for now. */ > + fail_if(!ops->write); > + BUILD_BUG_ON(sizeof(*mmvalp) < 64); > + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, > + ctxt)) != X86EMUL_OKAY || > + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, > + ctxt)) != X86EMUL_OKAY ) > + goto done; > + state->simd_size = simd_none; > + sfence = true; > + break; > + > + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ > + vcpu_must_have(movdiri); > + generate_exception_if(dst.type != OP_MEM, EXC_UD); > + /* Ignore the non-temporal behavior for now. */ > + dst.val = src.val; > + sfence = true; > + break; I'm not certain this gives the required atomicity. AFAICT, it degrades into ops->write(), which can end up with bytewise writes. I think we need to map the destination and issue an explicit mov instruction. > --- a/tools/tests/x86_emulator/x86-emulate.c > +++ b/tools/tests/x86_emulator/x86-emulate.c > @@ -76,6 +76,8 @@ bool emul_test_init(void) > cp.feat.adx = true; > cp.feat.avx512pf = cp.feat.avx512f; > cp.feat.rdpid = true; > + cp.feat.movdiri = true; > + cp.feat.movdir64b = true; > cp.extd.clzero = true; > > if ( cpu_has_xsave ) > @@ -137,15 +139,15 @@ int emul_test_cpuid( > res->c |= 1U << 22; > > /* > - * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G prefetch > - * insns, so we can always run the respective tests. 
> + * The emulator doesn't itself use ADCX/ADOX/RDPID/MOVDIR* nor the S/G > + * prefetch insns, so we can always run the respective tests. > */ > if ( leaf == 7 && subleaf == 0 ) > { > res->b |= (1U << 10) | (1U << 19); > if ( res->b & (1U << 16) ) > res->b |= 1U << 26; > - res->c |= 1U << 22; > + res->c |= (1U << 22) | (1U << 27) | (1U << 28); I've just noticed, but we shouldn't be having both the named variables and these unnamed ones. Is their presence a rebasing issue around patches into the test suite? ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns 2019-08-27 16:04 ` Andrew Cooper @ 2019-08-28 6:26 ` Jan Beulich 2019-08-28 11:51 ` Andrew Cooper 0 siblings, 1 reply; 36+ messages in thread From: Jan Beulich @ 2019-08-28 6:26 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 27.08.2019 18:04, Andrew Cooper wrote: > On 01/07/2019 12:58, Jan Beulich wrote: >> @@ -9896,6 +9902,32 @@ x86_emulate( >> : "0" ((uint32_t)src.val), "rm" (_regs.edx) ); >> break; >> >> + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ >> + vcpu_must_have(movdir64b); >> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >> + src.val = truncate_ea(*dst.reg); >> + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, ctxt, ops), >> + EXC_GP, 0); >> + /* Ignore the non-temporal behavior for now. */ >> + fail_if(!ops->write); >> + BUILD_BUG_ON(sizeof(*mmvalp) < 64); >> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, >> + ctxt)) != X86EMUL_OKAY || >> + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, >> + ctxt)) != X86EMUL_OKAY ) >> + goto done; >> + state->simd_size = simd_none; >> + sfence = true; >> + break; >> + >> + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ >> + vcpu_must_have(movdiri); >> + generate_exception_if(dst.type != OP_MEM, EXC_UD); >> + /* Ignore the non-temporal behavior for now. */ >> + dst.val = src.val; >> + sfence = true; >> + break; > > I'm not certain this gives the required atomicity. AFAICT, it degrades > into ops->write(), which can end up with bytewise writes. > > I think we need to map the destination and issue an explicit mov > instruction. I don't think so, no - plain MOV has the same property (in particular when not going through the cache), and also uses the ->write() hook. It's the hook function that needs to behave properly for all of this to be correct. 
>> --- a/tools/tests/x86_emulator/x86-emulate.c >> +++ b/tools/tests/x86_emulator/x86-emulate.c >> @@ -76,6 +76,8 @@ bool emul_test_init(void) >> cp.feat.adx = true; >> cp.feat.avx512pf = cp.feat.avx512f; >> cp.feat.rdpid = true; >> + cp.feat.movdiri = true; >> + cp.feat.movdir64b = true; >> cp.extd.clzero = true; >> >> if ( cpu_has_xsave ) >> @@ -137,15 +139,15 @@ int emul_test_cpuid( >> res->c |= 1U << 22; >> >> /* >> - * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G prefetch >> - * insns, so we can always run the respective tests. >> + * The emulator doesn't itself use ADCX/ADOX/RDPID/MOVDIR* nor the S/G >> + * prefetch insns, so we can always run the respective tests. >> */ >> if ( leaf == 7 && subleaf == 0 ) >> { >> res->b |= (1U << 10) | (1U << 19); >> if ( res->b & (1U << 16) ) >> res->b |= 1U << 26; >> - res->c |= 1U << 22; >> + res->c |= (1U << 22) | (1U << 27) | (1U << 28); > > I've just noticed, but we shouldn't be having both the named variables > and these unnamed ones. Is their presence a rebasing issue around > patches into the test suite? emul_test_init() gained its current shape from fd35f32b4b, while emul_test_cpuid() had been left untouched at that point. So I guess it's more like a forgotten piece of conversion work. I'm unsure though whether such a conversion is actually a good idea, since aiui it would mean cloning at least guest_cpuid()'s first switch() into this function, which is quite a bit more code than there is right now. Perhaps (the common part of) that switch() could be morphed into a library function ... Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns 2019-08-28 6:26 ` Jan Beulich @ 2019-08-28 11:51 ` Andrew Cooper 2019-08-28 12:19 ` Jan Beulich 0 siblings, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-28 11:51 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Paul Durrant, Wei Liu, RogerPau Monne On 28/08/2019 07:26, Jan Beulich wrote: > On 27.08.2019 18:04, Andrew Cooper wrote: >> On 01/07/2019 12:58, Jan Beulich wrote: >>> @@ -9896,6 +9902,32 @@ x86_emulate( >>> : "0" ((uint32_t)src.val), "rm" >>> (_regs.edx) ); >>> break; >>> + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ >>> + vcpu_must_have(movdir64b); >>> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >>> + src.val = truncate_ea(*dst.reg); >>> + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, >>> ctxt, ops), >>> + EXC_GP, 0); >>> + /* Ignore the non-temporal behavior for now. */ >>> + fail_if(!ops->write); >>> + BUILD_BUG_ON(sizeof(*mmvalp) < 64); >>> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, >>> + ctxt)) != X86EMUL_OKAY || >>> + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, >>> + ctxt)) != X86EMUL_OKAY ) >>> + goto done; >>> + state->simd_size = simd_none; >>> + sfence = true; >>> + break; >>> + >>> + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ >>> + vcpu_must_have(movdiri); >>> + generate_exception_if(dst.type != OP_MEM, EXC_UD); >>> + /* Ignore the non-temporal behavior for now. */ >>> + dst.val = src.val; >>> + sfence = true; >>> + break; >> >> I'm not certain this gives the required atomicity. AFAICT, it degrades >> into ops->write(), which can end up with bytewise writes. >> >> I think we need to map the destination and issue an explicit mov >> instruction. > > I don't think so, no - plain MOV has the same property (in particular > when not going through the cache), and also uses the ->write() hook. > It's the hook function that needs to behave properly for all of this > to be correct. 
It only occurred to me after sending this email that plain MOV was broken as well. > >>> --- a/tools/tests/x86_emulator/x86-emulate.c >>> +++ b/tools/tests/x86_emulator/x86-emulate.c >>> @@ -76,6 +76,8 @@ bool emul_test_init(void) >>> cp.feat.adx = true; >>> cp.feat.avx512pf = cp.feat.avx512f; >>> cp.feat.rdpid = true; >>> + cp.feat.movdiri = true; >>> + cp.feat.movdir64b = true; >>> cp.extd.clzero = true; >>> if ( cpu_has_xsave ) >>> @@ -137,15 +139,15 @@ int emul_test_cpuid( >>> res->c |= 1U << 22; >>> /* >>> - * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G >>> prefetch >>> - * insns, so we can always run the respective tests. >>> + * The emulator doesn't itself use ADCX/ADOX/RDPID/MOVDIR* nor >>> the S/G >>> + * prefetch insns, so we can always run the respective tests. >>> */ >>> if ( leaf == 7 && subleaf == 0 ) >>> { >>> res->b |= (1U << 10) | (1U << 19); >>> if ( res->b & (1U << 16) ) >>> res->b |= 1U << 26; >>> - res->c |= 1U << 22; >>> + res->c |= (1U << 22) | (1U << 27) | (1U << 28); >> >> I've just noticed, but we shouldn't be having both the named variables >> and these unnamed ones. Is their presence a rebasing issue around >> patches into the test suite? > > emul_test_init() gained its current shape from fd35f32b4b, while > emul_test_cpuid() had been left untouched at that point. So I guess > it's more like a forgotten piece of conversion work. I'm unsure > though whether such a conversion is actually a good idea, since aiui > it would mean cloning at least guest_cpuid()'s first switch() into > this function, which is quite a bit more code than there is right > now. Perhaps (the common part of) that switch() could be morphed > into a library function ... I'll throw it onto the big pile of CPUID work. It is not trivial to library-fy because of the Xen/Viridian leaf handling. 
~Andrew
* Re: [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns 2019-08-28 11:51 ` Andrew Cooper @ 2019-08-28 12:19 ` Jan Beulich 0 siblings, 0 replies; 36+ messages in thread From: Jan Beulich @ 2019-08-28 12:19 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, PaulDurrant, Wei Liu, RogerPau Monne On 28.08.2019 13:51, Andrew Cooper wrote: > On 28/08/2019 07:26, Jan Beulich wrote: >> On 27.08.2019 18:04, Andrew Cooper wrote: >>> On 01/07/2019 12:58, Jan Beulich wrote: >>>> @@ -9896,6 +9902,32 @@ x86_emulate( >>>> : "0" ((uint32_t)src.val), "rm" >>>> (_regs.edx) ); >>>> break; >>>> + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ >>>> + vcpu_must_have(movdir64b); >>>> + generate_exception_if(ea.type != OP_MEM, EXC_UD); >>>> + src.val = truncate_ea(*dst.reg); >>>> + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, >>>> ctxt, ops), >>>> + EXC_GP, 0); >>>> + /* Ignore the non-temporal behavior for now. */ >>>> + fail_if(!ops->write); >>>> + BUILD_BUG_ON(sizeof(*mmvalp) < 64); >>>> + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, >>>> + ctxt)) != X86EMUL_OKAY || >>>> + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, >>>> + ctxt)) != X86EMUL_OKAY ) >>>> + goto done; >>>> + state->simd_size = simd_none; >>>> + sfence = true; >>>> + break; >>>> + >>>> + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ >>>> + vcpu_must_have(movdiri); >>>> + generate_exception_if(dst.type != OP_MEM, EXC_UD); >>>> + /* Ignore the non-temporal behavior for now. */ >>>> + dst.val = src.val; >>>> + sfence = true; >>>> + break; >>> >>> I'm not certain this gives the required atomicity. AFAICT, it degrades >>> into ops->write(), which can end up with bytewise writes. >>> >>> I think we need to map the destination and issue an explicit mov >>> instruction. >> >> I don't think so, no - plain MOV has the same property (in particular >> when not going through the cache), and also uses the ->write() hook. 
>> It's the hook function that needs to behave properly for all of this >> to be correct. > > It only occurred to me after sending this email that plain MOV was > broken as well. Well, they're both in the same state, but it shouldn't be the core emulator's job to establish a mapping. In fact hvmemul_write() already calls hvmemul_map_linear_addr(). What's sub-optimal is its reliance on memcpy()'s implementation (when there's a valid GFN->MFN mapping). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel ^ permalink raw reply [flat|nested] 36+ messages in thread
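The fix Jan alludes to — not relying on memcpy()'s implementation once a valid mapping exists — amounts to issuing one correctly sized store for naturally aligned small writes. The sketch below is hypothetical (it is not hvmemul_write()), but it illustrates how a write backend could preserve the single-copy atomicity that plain MOV and MOVDIRI require, falling back to memcpy() only where no such guarantee is needed.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch, not the actual Xen hook implementation: update
 * a mapped destination with a single correctly sized store for
 * naturally aligned 1/2/4/8-byte writes, so the store cannot be torn
 * into byte-wise accesses the way a generic memcpy() might do it.
 */
static void mapped_write(void *dst, const void *src, unsigned int bytes)
{
    if ( bytes && !((uintptr_t)dst & (bytes - 1)) )
        switch ( bytes )
        {
        case 1: *(volatile uint8_t  *)dst = *(const uint8_t  *)src; return;
        case 2: *(volatile uint16_t *)dst = *(const uint16_t *)src; return;
        case 4: *(volatile uint32_t *)dst = *(const uint32_t *)src; return;
        case 8: *(volatile uint64_t *)dst = *(const uint64_t *)src; return;
        }

    memcpy(dst, src, bytes); /* fallback: no atomicity guarantee */
}
```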
* Re: [Xen-devel] [PATCH 0/6] x86emul: further work 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich ` (5 preceding siblings ...) 2019-07-01 11:58 ` [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns Jan Beulich @ 2019-08-15 14:24 ` Andrew Cooper 2019-08-27 8:37 ` Jan Beulich 6 siblings, 1 reply; 36+ messages in thread From: Andrew Cooper @ 2019-08-15 14:24 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Paul Durrant, Wei Liu, Roger Pau Monne On 01/07/2019 12:47, Jan Beulich wrote: > On top of the AVX512 series I'd like to put out for review/discussion > > 1: generalize wbinvd() hook > 2: support WBNOINVD > 3: generalize invlpg() hook > 4: move INVPCID_TYPE_* to x86-defns.h > 5: support INVPCID > 6: support MOVDIR{I,64B} insns Do you have this series in patch form by any chance? I encountered WBNOINVD while doing some more Rome support, but the patches themselves are almost unreviewable given the context mangling. ~Andrew
* Re: [Xen-devel] [PATCH 0/6] x86emul: further work 2019-08-15 14:24 ` [Xen-devel] [PATCH 0/6] x86emul: further work Andrew Cooper @ 2019-08-27 8:37 ` Jan Beulich 0 siblings, 0 replies; 36+ messages in thread From: Jan Beulich @ 2019-08-27 8:37 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Paul Durrant, Wei Liu, Roger Pau Monne [-- Attachment #1: Type: text/plain, Size: 613 bytes --] On 15.08.2019 16:24, Andrew Cooper wrote: > On 01/07/2019 12:47, Jan Beulich wrote: >> On top of the AVX512 series I'd like to put out for review/discussion >> >> 1: generalize wbinvd() hook >> 2: support WBNOINVD >> 3: generalize invlpg() hook >> 4: move INVPCID_TYPE_* to x86-defns.h >> 5: support INVPCID >> 6: support MOVDIR{I,64B} insns > > Do you have this series in patch form by any chance? Sure, attached. > I encountered > WBNOINVD while doing some more Rome support, but the patches themselves > are almost unreviewable given the context mangling. I'm sorry for this still unresolved issue. Jan [-- Attachment #2: 01-x86emul-cache-management.patch --] [-- Type: text/plain, Size: 12460 bytes --] x86emul: generalize wbinvd() hook The hook is already in use for other purposes, and emulating e.g. CLFLUSH by issuing WBINVD is, well, not very nice. Rename the hook and add parameters. Use lighter weight flushing insns when possible in hvmemul_cache_op(). hvmemul_cache_op() treating x86emul_invd the same as x86emul_wbinvd is to retain original behavior, but I'm not sure this is what we want in the long run. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- v3: Re-base. v2: Use cache_op() as hook name. Convert macros to inline functions in system.h. Re-base. --- I was unsure about PREFETCH* and CLDEMOTE - both are cache management insns too, but the emulator currently treats them as a NOP without invoking any hooks. 
I was also uncertain about the new cache_flush_permitted() instance - generally I think it wouldn't be too bad if we allowed line flushes in all cases, in which case the checks in the ->wbinvd_intercept() handlers would suffice (as they did until now). --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -382,10 +382,13 @@ static int fuzz_invlpg( return maybe_fail(ctxt, "invlpg", false); } -static int fuzz_wbinvd( +static int fuzz_cache_op( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { - return maybe_fail(ctxt, "wbinvd", true); + return maybe_fail(ctxt, "cache-management", true); } static int fuzz_write_io( @@ -620,7 +623,7 @@ static const struct x86_emulate_ops all_ SET(read_xcr), SET(read_msr), SET(write_msr), - SET(wbinvd), + SET(cache_op), SET(invlpg), .get_fpu = emul_test_get_fpu, .put_fpu = emul_test_put_fpu, @@ -729,7 +732,7 @@ enum { HOOK_read_xcr, HOOK_read_msr, HOOK_write_msr, - HOOK_wbinvd, + HOOK_cache_op, HOOK_cpuid, HOOK_inject_hw_exception, HOOK_inject_sw_interrupt, @@ -773,7 +776,7 @@ static void disable_hooks(struct x86_emu MAYBE_DISABLE_HOOK(read_xcr); MAYBE_DISABLE_HOOK(read_msr); MAYBE_DISABLE_HOOK(write_msr); - MAYBE_DISABLE_HOOK(wbinvd); + MAYBE_DISABLE_HOOK(cache_op); MAYBE_DISABLE_HOOK(cpuid); MAYBE_DISABLE_HOOK(get_fpu); MAYBE_DISABLE_HOOK(invlpg); --- a/xen/arch/x86/Rules.mk +++ b/xen/arch/x86/Rules.mk @@ -19,7 +19,9 @@ $(call as-option-add,CFLAGS,CC,"crc32 %e $(call as-option-add,CFLAGS,CC,"invept (%rax)$$(comma)%rax",-DHAVE_AS_EPT) $(call as-option-add,CFLAGS,CC,"rdrand %eax",-DHAVE_AS_RDRAND) $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE) +$(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT) $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED) +$(call as-option-add,CFLAGS,CC,"clwb (%rax)",-DHAVE_AS_CLWB) $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1", \ 
-U__OBJECT_LABEL__ -DHAVE_AS_QUOTED_SYM \ '-D__OBJECT_LABEL__=$(subst $(BASEDIR)/,,$(CURDIR))/$$@') --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -25,6 +25,7 @@ #include <asm/hvm/trace.h> #include <asm/hvm/support.h> #include <asm/hvm/svm/svm.h> +#include <asm/iocap.h> #include <asm/vm_event.h> static void hvmtrace_io_assist(const ioreq_t *p) @@ -555,16 +556,12 @@ static void *hvmemul_map_linear_addr( mfn_t *mfn = &hvmemul_ctxt->mfn[0]; /* - * The caller has no legitimate reason for trying a zero-byte write, but - * all other code here is written to work if the check below was dropped. - * - * The maximum write size depends on the number of adjacent mfns[] which + * The maximum access size depends on the number of adjacent mfns[] which * can be vmap()'d, accouting for possible misalignment within the region. * The higher level emulation callers are responsible for ensuring that - * mfns[] is large enough for the requested write size. + * mfns[] is large enough for the requested access size. 
*/ - if ( bytes == 0 || - nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) + if ( nr_frames > ARRAY_SIZE(hvmemul_ctxt->mfn) ) { ASSERT_UNREACHABLE(); goto unhandleable; @@ -669,8 +666,6 @@ static void hvmemul_unmap_linear_addr( unsigned int i; mfn_t *mfn = &hvmemul_ctxt->mfn[0]; - ASSERT(bytes > 0); - if ( nr_frames == 1 ) unmap_domain_page(mapping); else @@ -1483,7 +1478,10 @@ static int hvmemul_write_msr_discard( return X86EMUL_OKAY; } -static int hvmemul_wbinvd_discard( +static int hvmemul_cache_op_discard( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { return X86EMUL_OKAY; @@ -2159,10 +2157,65 @@ static int hvmemul_write_msr( return rc; } -static int hvmemul_wbinvd( +static int hvmemul_cache_op( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { - alternative_vcall(hvm_funcs.wbinvd_intercept); + struct hvm_emulate_ctxt *hvmemul_ctxt = + container_of(ctxt, struct hvm_emulate_ctxt, ctxt); + uint32_t pfec = PFEC_page_present; + + if ( !cache_flush_permitted(current->domain) ) + return X86EMUL_OKAY; + + switch ( op ) + { + unsigned long addr; + int rc; + void *mapping; + + case x86emul_clflush: + case x86emul_clflushopt: + case x86emul_clwb: + ASSERT(!is_x86_system_segment(seg)); + + rc = hvmemul_virtual_to_linear(seg, offset, 0, NULL, + hvm_access_read, hvmemul_ctxt, &addr); + if ( rc != X86EMUL_OKAY ) + break; + + if ( hvmemul_ctxt->seg_reg[x86_seg_ss].dpl == 3 ) + pfec |= PFEC_user_mode; + + mapping = hvmemul_map_linear_addr(addr, 0, pfec, hvmemul_ctxt); + if ( mapping == ERR_PTR(~X86EMUL_EXCEPTION) ) + return X86EMUL_EXCEPTION; + if ( IS_ERR_OR_NULL(mapping) ) + break; + + if ( cpu_has_clflush ) + { + if ( op == x86emul_clwb && cpu_has_clwb ) + clwb(mapping); + else if ( op == x86emul_clflushopt && cpu_has_clflushopt ) + clflushopt(mapping); + else + clflush(mapping); + + hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); + break; + } + + 
hvmemul_unmap_linear_addr(mapping, addr, 0, hvmemul_ctxt); + /* fall through */ + case x86emul_invd: + case x86emul_wbinvd: + alternative_vcall(hvm_funcs.wbinvd_intercept); + break; + } + return X86EMUL_OKAY; } @@ -2363,7 +2416,7 @@ static const struct x86_emulate_ops hvm_ .write_xcr = hvmemul_write_xcr, .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr, - .wbinvd = hvmemul_wbinvd, + .cache_op = hvmemul_cache_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, @@ -2390,7 +2443,7 @@ static const struct x86_emulate_ops hvm_ .write_xcr = hvmemul_write_xcr, .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr_discard, - .wbinvd = hvmemul_wbinvd_discard, + .cache_op = hvmemul_cache_op_discard, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1117,9 +1117,11 @@ static int write_msr(unsigned int reg, u return X86EMUL_UNHANDLEABLE; } -/* Name it differently to avoid clashing with wbinvd() */ -static int _wbinvd(struct x86_emulate_ctxt *ctxt) +static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt) { + ASSERT(op == x86emul_wbinvd); + /* Ignore the instruction if unprivileged. */ if ( !cache_flush_permitted(current->domain) ) /* @@ -1237,7 +1239,7 @@ static const struct x86_emulate_ops priv .read_msr = read_msr, .write_msr = write_msr, .cpuid = x86emul_cpuid, - .wbinvd = _wbinvd, + .cache_op = cache_op, }; int pv_emulate_privileged_op(struct cpu_user_regs *regs) --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -5933,8 +5933,11 @@ x86_emulate( case X86EMUL_OPC(0x0f, 0x08): /* invd */ case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ generate_exception_if(!mode_ring0(), EXC_GP, 0); - fail_if(ops->wbinvd == NULL); - if ( (rc = ops->wbinvd(ctxt)) != 0 ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(b == 0x09 ? 
x86emul_wbinvd + : x86emul_invd, + x86_seg_none, 0, + ctxt)) != X86EMUL_OKAY ) goto done; break; @@ -7801,8 +7804,9 @@ x86_emulate( /* else clwb */ fail_if(!vex.pfx); vcpu_must_have(clwb); - fail_if(!ops->wbinvd); - if ( (rc = ops->wbinvd(ctxt)) != X86EMUL_OKAY ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(x86emul_clwb, ea.mem.seg, ea.mem.off, + ctxt)) != X86EMUL_OKAY ) goto done; break; case 7: @@ -7818,8 +7822,11 @@ x86_emulate( vcpu_must_have(clflush); else vcpu_must_have(clflushopt); - fail_if(ops->wbinvd == NULL); - if ( (rc = ops->wbinvd(ctxt)) != 0 ) + fail_if(!ops->cache_op); + if ( (rc = ops->cache_op(vex.pfx ? x86emul_clflushopt + : x86emul_clflush, + ea.mem.seg, ea.mem.off, + ctxt)) != X86EMUL_OKAY ) goto done; break; default: --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -176,6 +176,14 @@ enum x86_emulate_fpu_type { X86EMUL_FPU_none }; +enum x86emul_cache_op { + x86emul_clflush, + x86emul_clflushopt, + x86emul_clwb, + x86emul_invd, + x86emul_wbinvd, +}; + struct x86_emulate_state; /* @@ -452,8 +460,15 @@ struct x86_emulate_ops uint64_t val, struct x86_emulate_ctxt *ctxt); - /* wbinvd: Write-back and invalidate cache contents. */ - int (*wbinvd)( + /* + * cache_op: Write-back and/or invalidate cache contents. + * + * @seg:@offset applicable only to some of enum x86emul_cache_op. + */ + int (*cache_op)( + enum x86emul_cache_op op, + enum x86_segment seg, + unsigned long offset, struct x86_emulate_ctxt *ctxt); /* cpuid: Emulate CPUID via given set of EAX-EDX inputs/outputs. 
*/ --- a/xen/include/asm-x86/cpufeature.h +++ b/xen/include/asm-x86/cpufeature.h @@ -102,6 +102,8 @@ #define cpu_has_rdseed boot_cpu_has(X86_FEATURE_RDSEED) #define cpu_has_smap boot_cpu_has(X86_FEATURE_SMAP) #define cpu_has_avx512_ifma boot_cpu_has(X86_FEATURE_AVX512_IFMA) +#define cpu_has_clflushopt boot_cpu_has(X86_FEATURE_CLFLUSHOPT) +#define cpu_has_clwb boot_cpu_has(X86_FEATURE_CLWB) #define cpu_has_avx512er boot_cpu_has(X86_FEATURE_AVX512ER) #define cpu_has_avx512cd boot_cpu_has(X86_FEATURE_AVX512CD) #define cpu_has_sha boot_cpu_has(X86_FEATURE_SHA) --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -21,6 +21,23 @@ static inline void clflush(const void *p asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); } +static inline void clflushopt(const void *p) +{ + asm volatile ( "data16 clflush %0" :: "m" (*(const char *)p) ); +} + +static inline void clwb(const void *p) +{ +#if defined(HAVE_AS_CLWB) + asm volatile ( "clwb %0" :: "m" (*(const char *)p) ); +#elif defined(HAVE_AS_XSAVEOPT) + asm volatile ( "data16 xsaveopt %0" :: "m" (*(const char *)p) ); +#else + asm volatile ( ".byte 0x66, 0x0f, 0xae, 0x32" + :: "d" (p), "m" (*(const char *)p) ); +#endif +} + #define xchg(ptr,v) \ ((__typeof__(*(ptr)))__xchg((unsigned long)(v),(ptr),sizeof(*(ptr)))) [-- Attachment #3: 02-x86emul-WBNOINVD.patch --] [-- Type: text/plain, Size: 6558 bytes --] TODO: Paul vaguely indicates WBNOINVD will have an exit qualification of 1. Whether that's an enum or bit mask is unclear. May want to integrate this nevertheless. 
x86emul: support WBNOINVD Rev 035 of Intel's ISA extensions document does not state intercept behavior for the insn (I've been unofficially told that the distinction is going to be by exit qualification, as I would have assumed considering that this way it's sufficiently transparent to unaware software, and using WBINVD in place of WBNOINVD is always correct, just less efficient), so in the HVM case for now it'll be backed by the same ->wbinvd_intercept() handlers. Use this occasion and also add the two missing table entries for CLDEMOTE, which doesn't require any further changes to make work. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> --- v2: Re-base. Convert wbnoinvd() to an inline function. --- a/tools/libxl/libxl_cpuid.c +++ b/tools/libxl/libxl_cpuid.c @@ -208,6 +208,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"avx512-bitalg",0x00000007, 0, CPUID_REG_ECX, 12, 1}, {"avx512-vpopcntdq",0x00000007,0,CPUID_REG_ECX, 14, 1}, {"rdpid", 0x00000007, 0, CPUID_REG_ECX, 22, 1}, + {"cldemote", 0x00000007, 0, CPUID_REG_ECX, 25, 1}, {"avx512-4vnniw",0x00000007, 0, CPUID_REG_EDX, 2, 1}, {"avx512-4fmaps",0x00000007, 0, CPUID_REG_EDX, 3, 1}, @@ -256,6 +257,7 @@ int libxl_cpuid_parse_config(libxl_cpuid {"invtsc", 0x80000007, NA, CPUID_REG_EDX, 8, 1}, + {"wbnoinvd", 0x80000008, NA, CPUID_REG_EBX, 9, 1}, {"ibpb", 0x80000008, NA, CPUID_REG_EBX, 12, 1}, {"nc", 0x80000008, NA, CPUID_REG_ECX, 0, 8}, {"apicidsize", 0x80000008, NA, CPUID_REG_ECX, 12, 4}, --- a/tools/misc/xen-cpuid.c +++ b/tools/misc/xen-cpuid.c @@ -146,6 +146,8 @@ static const char *const str_e8b[32] = { [ 0] = "clzero", + /* [ 8] */ [ 9] = "wbnoinvd", + [12] = "ibpb", }; --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2212,6 +2212,7 @@ static int hvmemul_cache_op( /* fall through */ case x86emul_invd: case x86emul_wbinvd: + case x86emul_wbnoinvd: alternative_vcall(hvm_funcs.wbinvd_intercept); break; } --- a/xen/arch/x86/pv/emul-priv-op.c +++
b/xen/arch/x86/pv/emul-priv-op.c @@ -1120,7 +1120,7 @@ static int write_msr(unsigned int reg, u static int cache_op(enum x86emul_cache_op op, enum x86_segment seg, unsigned long offset, struct x86_emulate_ctxt *ctxt) { - ASSERT(op == x86emul_wbinvd); + ASSERT(op == x86emul_wbinvd || op == x86emul_wbnoinvd); /* Ignore the instruction if unprivileged. */ if ( !cache_flush_permitted(current->domain) ) @@ -1129,6 +1129,8 @@ static int cache_op(enum x86emul_cache_o * newer linux uses this in some start-of-day timing loops. */ ; + else if ( op == x86emul_wbnoinvd && cpu_has_wbnoinvd ) + wbnoinvd(); else wbinvd(); --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -1869,6 +1869,7 @@ in_protmode( #define vcpu_has_fma4() (ctxt->cpuid->extd.fma4) #define vcpu_has_tbm() (ctxt->cpuid->extd.tbm) #define vcpu_has_clzero() (ctxt->cpuid->extd.clzero) +#define vcpu_has_wbnoinvd() (ctxt->cpuid->extd.wbnoinvd) #define vcpu_has_bmi1() (ctxt->cpuid->feat.bmi1) #define vcpu_has_hle() (ctxt->cpuid->feat.hle) @@ -5931,10 +5932,13 @@ x86_emulate( break; case X86EMUL_OPC(0x0f, 0x08): /* invd */ - case X86EMUL_OPC(0x0f, 0x09): /* wbinvd */ + case X86EMUL_OPC(0x0f, 0x09): /* wbinvd / wbnoinvd */ generate_exception_if(!mode_ring0(), EXC_GP, 0); fail_if(!ops->cache_op); - if ( (rc = ops->cache_op(b == 0x09 ? x86emul_wbinvd + if ( (rc = ops->cache_op(b == 0x09 ? !repe_prefix() || + !vcpu_has_wbnoinvd() + ? 
x86emul_wbinvd + : x86emul_wbnoinvd : x86emul_invd, x86_seg_none, 0, ctxt)) != X86EMUL_OKAY ) --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -182,6 +182,7 @@ enum x86emul_cache_op { x86emul_clwb, x86emul_invd, x86emul_wbinvd, + x86emul_wbnoinvd, }; struct x86_emulate_state; --- a/xen/include/asm-x86/cpufeature.h +++ b/xen/include/asm-x86/cpufeature.h @@ -129,6 +129,9 @@ #define cpu_has_avx512_4fmaps boot_cpu_has(X86_FEATURE_AVX512_4FMAPS) #define cpu_has_tsx_force_abort boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT) +/* CPUID level 0x80000008.ebx */ +#define cpu_has_wbnoinvd boot_cpu_has(X86_FEATURE_WBNOINVD) + /* Synthesized. */ #define cpu_has_arch_perfmon boot_cpu_has(X86_FEATURE_ARCH_PERFMON) #define cpu_has_cpuid_faulting boot_cpu_has(X86_FEATURE_CPUID_FAULTING) --- a/xen/include/asm-x86/system.h +++ b/xen/include/asm-x86/system.h @@ -16,6 +16,11 @@ static inline void wbinvd(void) asm volatile ( "wbinvd" ::: "memory" ); } +static inline void wbnoinvd(void) +{ + asm volatile ( "repe; wbinvd" : : : "memory" ); +} + static inline void clflush(const void *p) { asm volatile ( "clflush %0" :: "m" (*(const char *)p) ); --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -236,6 +236,7 @@ XEN_CPUFEATURE(AVX512_VNNI, 6*32+11) / XEN_CPUFEATURE(AVX512_BITALG, 6*32+12) /*A Support for VPOPCNT[B,W] and VPSHUFBITQMB */ XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */ XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */ +XEN_CPUFEATURE(CLDEMOTE, 6*32+25) /*A CLDEMOTE instruction */ /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */ XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */ @@ -243,6 +244,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ XEN_CPUFEATURE(IBPB, 
8*32+12) /*A IBPB support only (no IBRS, used by AMD) */ /* Intel-defined CPU features, CPUID level 0x00000007:0.edx, word 9 */ [-- Attachment #4: 03-x86emul-TLB-management.patch --] [-- Type: text/plain, Size: 8085 bytes --] x86emul: generalize invlpg() hook The hook is already in use for INVLPGA as well. Rename the hook and add parameters. For the moment INVLPGA with a non-zero ASID remains unsupported, but the TODO item gets pushed into the actual hook handler. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> --- v2: New. --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -370,16 +370,23 @@ static int fuzz_cmpxchg( return maybe_fail(ctxt, "cmpxchg", true); } -static int fuzz_invlpg( - enum x86_segment seg, - unsigned long offset, +static int fuzz_tlb_op( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, struct x86_emulate_ctxt *ctxt) { - /* invlpg(), unlike all other hooks, may be called with x86_seg_none. 
*/ - assert(is_x86_user_segment(seg) || seg == x86_seg_none); - assert(ctxt->addr_size == 64 || !(offset >> 32)); + switch ( op ) + { + case x86emul_invlpg: + assert(is_x86_user_segment(aux)); + /* fall through */ + case x86emul_invlpga: + assert(ctxt->addr_size == 64 || !(addr >> 32)); + break; + } - return maybe_fail(ctxt, "invlpg", false); + return maybe_fail(ctxt, "TLB-management", false); } static int fuzz_cache_op( @@ -624,7 +631,7 @@ static const struct x86_emulate_ops all_ SET(read_msr), SET(write_msr), SET(cache_op), - SET(invlpg), + SET(tlb_op), .get_fpu = emul_test_get_fpu, .put_fpu = emul_test_put_fpu, .cpuid = emul_test_cpuid, @@ -733,12 +740,12 @@ enum { HOOK_read_msr, HOOK_write_msr, HOOK_cache_op, + HOOK_tlb_op, HOOK_cpuid, HOOK_inject_hw_exception, HOOK_inject_sw_interrupt, HOOK_get_fpu, HOOK_put_fpu, - HOOK_invlpg, HOOK_vmfunc, CANONICALIZE_rip, CANONICALIZE_rsp, @@ -777,9 +784,9 @@ static void disable_hooks(struct x86_emu MAYBE_DISABLE_HOOK(read_msr); MAYBE_DISABLE_HOOK(write_msr); MAYBE_DISABLE_HOOK(cache_op); + MAYBE_DISABLE_HOOK(tlb_op); MAYBE_DISABLE_HOOK(cpuid); MAYBE_DISABLE_HOOK(get_fpu); - MAYBE_DISABLE_HOOK(invlpg); } /* --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2349,36 +2349,52 @@ static void hvmemul_put_fpu( } } -static int hvmemul_invlpg( - enum x86_segment seg, - unsigned long offset, +static int hvmemul_tlb_op( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, struct x86_emulate_ctxt *ctxt) { struct hvm_emulate_ctxt *hvmemul_ctxt = container_of(ctxt, struct hvm_emulate_ctxt, ctxt); - unsigned long addr; - int rc; - - rc = hvmemul_virtual_to_linear( - seg, offset, 1, NULL, hvm_access_none, hvmemul_ctxt, &addr); + int rc = X86EMUL_OKAY; - if ( rc == X86EMUL_EXCEPTION ) + switch ( op ) { - /* - * `invlpg` takes segment bases into account, but is not subject to - * faults from segment type/limit checks, and is specified as a NOP - * when issued on non-canonical addresses. 
- * - * hvmemul_virtual_to_linear() raises exceptions for type/limit - * violations, so squash them. - */ - x86_emul_reset_event(ctxt); - rc = X86EMUL_OKAY; + case x86emul_invlpg: + rc = hvmemul_virtual_to_linear(aux, addr, 1, NULL, hvm_access_none, + hvmemul_ctxt, &addr); + + if ( rc == X86EMUL_EXCEPTION ) + { + /* + * `invlpg` takes segment bases into account, but is not subject + * to faults from segment type/limit checks, and is specified as + * a NOP when issued on non-canonical addresses. + * + * hvmemul_virtual_to_linear() raises exceptions for type/limit + * violations, so squash them. + */ + x86_emul_reset_event(ctxt); + rc = X86EMUL_OKAY; + } + + if ( rc == X86EMUL_OKAY ) + paging_invlpg(current, addr); + break; + + case x86emul_invlpga: + /* TODO: Support ASIDs. */ + if ( !aux ) + paging_invlpg(current, addr); + else + { + x86_emul_hw_exception(TRAP_invalid_op, X86_EVENT_NO_EC, ctxt); + rc = X86EMUL_EXCEPTION; + } + break; } - if ( rc == X86EMUL_OKAY ) - paging_invlpg(current, addr); - return rc; } @@ -2418,10 +2434,10 @@ static const struct x86_emulate_ops hvm_ .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr, .cache_op = hvmemul_cache_op, + .tlb_op = hvmemul_tlb_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, - .invlpg = hvmemul_invlpg, .vmfunc = hvmemul_vmfunc, }; @@ -2445,10 +2461,10 @@ static const struct x86_emulate_ops hvm_ .read_msr = hvmemul_read_msr, .write_msr = hvmemul_write_msr_discard, .cache_op = hvmemul_cache_op_discard, + .tlb_op = hvmemul_tlb_op, .cpuid = x86emul_cpuid, .get_fpu = hvmemul_get_fpu, .put_fpu = hvmemul_put_fpu, - .invlpg = hvmemul_invlpg, .vmfunc = hvmemul_vmfunc, }; --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -5590,10 +5590,9 @@ x86_emulate( generate_exception_if(!(msr_val & EFER_SVME) || !in_protmode(ctxt, ops), EXC_UD); generate_exception_if(!mode_ring0(), EXC_GP, 0); - generate_exception_if(_regs.ecx, EXC_UD); /* TODO: 
Support ASIDs. */ - fail_if(ops->invlpg == NULL); - if ( (rc = ops->invlpg(x86_seg_none, truncate_ea(_regs.r(ax)), - ctxt)) ) + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invlpga, truncate_ea(_regs.r(ax)), + _regs.ecx, ctxt)) != X86EMUL_OKAY ) goto done; break; @@ -5747,8 +5746,9 @@ x86_emulate( case GRP7_MEM(7): /* invlpg */ ASSERT(ea.type == OP_MEM); generate_exception_if(!mode_ring0(), EXC_GP, 0); - fail_if(ops->invlpg == NULL); - if ( (rc = ops->invlpg(ea.mem.seg, ea.mem.off, ctxt)) ) + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invlpg, ea.mem.off, ea.mem.seg, + ctxt)) != X86EMUL_OKAY ) goto done; break; --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -185,6 +185,11 @@ enum x86emul_cache_op { x86emul_wbnoinvd, }; +enum x86emul_tlb_op { + x86emul_invlpg, + x86emul_invlpga, +}; + struct x86_emulate_state; /* @@ -472,6 +477,19 @@ struct x86_emulate_ops unsigned long offset, struct x86_emulate_ctxt *ctxt); + /* + * tlb_op: Invalidate paging structures which map addressed byte. + * + * @addr and @aux have @op-specific meaning: + * - INVLPG: @aux:@addr represent seg:offset + * - INVLPGA: @addr is the linear address, @aux the ASID + */ + int (*tlb_op)( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, + struct x86_emulate_ctxt *ctxt); + /* cpuid: Emulate CPUID via given set of EAX-EDX inputs/outputs. */ int (*cpuid)( uint32_t leaf, @@ -499,12 +517,6 @@ struct x86_emulate_ops enum x86_emulate_fpu_type backout, const struct x86_emul_fpu_aux *aux); - /* invlpg: Invalidate paging structures which map addressed byte. 
*/ - int (*invlpg)( - enum x86_segment seg, - unsigned long offset, - struct x86_emulate_ctxt *ctxt); - /* vmfunc: Emulate VMFUNC via given set of EAX ECX inputs */ int (*vmfunc)( struct x86_emulate_ctxt *ctxt); [-- Attachment #5: 04-x86-INVPCID-defs.patch --] [-- Type: text/plain, Size: 2085 bytes --] x86: move INVPCID_TYPE_* to x86-defns.h This way the insn emulator can then too use the #define-s. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> --- v2: New. --- a/xen/include/asm-x86/invpcid.h +++ b/xen/include/asm-x86/invpcid.h @@ -5,11 +5,6 @@ extern bool use_invpcid; -#define INVPCID_TYPE_INDIV_ADDR 0 -#define INVPCID_TYPE_SINGLE_CTXT 1 -#define INVPCID_TYPE_ALL_INCL_GLOBAL 2 -#define INVPCID_TYPE_ALL_NON_GLOBAL 3 - #define INVPCID_OPCODE ".byte 0x66, 0x0f, 0x38, 0x82\n" #define MODRM_ECX_01 ".byte 0x01\n" @@ -38,25 +33,25 @@ static inline void invpcid(unsigned int /* Flush all mappings for a given PCID and addr, not including globals */ static inline void invpcid_flush_one(unsigned int pcid, unsigned long addr) { - invpcid(pcid, addr, INVPCID_TYPE_INDIV_ADDR); + invpcid(pcid, addr, X86_INVPCID_TYPE_INDIV_ADDR); } /* Flush all mappings for a given PCID, not including globals */ static inline void invpcid_flush_single_context(unsigned int pcid) { - invpcid(pcid, 0, INVPCID_TYPE_SINGLE_CTXT); + invpcid(pcid, 0, X86_INVPCID_TYPE_SINGLE_CTXT); } /* Flush all mappings, including globals, for all PCIDs */ static inline void invpcid_flush_all(void) { - invpcid(0, 0, INVPCID_TYPE_ALL_INCL_GLOBAL); + invpcid(0, 0, X86_INVPCID_TYPE_ALL_INCL_GLOBAL); } /* Flush all mappings for all PCIDs, excluding globals */ static inline void invpcid_flush_all_nonglobals(void) { - invpcid(0, 0, INVPCID_TYPE_ALL_NON_GLOBAL); + invpcid(0, 0, X86_INVPCID_TYPE_ALL_NON_GLOBAL); } #endif /* _ASM_X86_INVPCID_H_ */ --- a/xen/include/asm-x86/x86-defns.h +++ b/xen/include/asm-x86/x86-defns.h @@ -108,4 +108,12 @@ */ #define X86_DR7_DEFAULT 
0x00000400 /* Default %dr7 value. */ +/* + * Invalidation types for the INVPCID instruction. + */ +#define X86_INVPCID_TYPE_INDIV_ADDR 0 +#define X86_INVPCID_TYPE_SINGLE_CTXT 1 +#define X86_INVPCID_TYPE_ALL_INCL_GLOBAL 2 +#define X86_INVPCID_TYPE_ALL_NON_GLOBAL 3 + #endif /* __XEN_X86_DEFNS_H__ */ [-- Attachment #6: 05-x86emul-INVPCID.patch --] [-- Type: text/plain, Size: 8237 bytes --] x86emul: support INVPCID Just like for INVLPGA the HVM hook only supports PCID 0 for the time being for individual address invalidation. It also translates the other types to a full flush, which is architecturally permitted and performance-wise presumably not much worse because emulation is slow anyway. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> --- v2: New. --- a/tools/fuzz/x86_instruction_emulator/fuzz-emul.c +++ b/tools/fuzz/x86_instruction_emulator/fuzz-emul.c @@ -382,6 +382,7 @@ static int fuzz_tlb_op( assert(is_x86_user_segment(aux)); /* fall through */ case x86emul_invlpga: + case x86emul_invpcid: assert(ctxt->addr_size == 64 || !(addr >> 32)); break; } --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -684,6 +684,38 @@ static int read_msr( return X86EMUL_UNHANDLEABLE; } +#define INVPCID_ADDR 0x12345678 +#define INVPCID_PCID 0x123 + +static int read_cr_invpcid( + unsigned int reg, + unsigned long *val, + struct x86_emulate_ctxt *ctxt) +{ + int rc = emul_test_read_cr(reg, val, ctxt); + + if ( rc == X86EMUL_OKAY && reg == 4 ) + *val |= X86_CR4_PCIDE; + + return rc; +} + +static int tlb_op_invpcid( + enum x86emul_tlb_op op, + unsigned long addr, + unsigned long aux, + struct x86_emulate_ctxt *ctxt) +{ + static unsigned int seq; + + if ( op != x86emul_invpcid || addr != INVPCID_ADDR || + x86emul_invpcid_pcid(aux) != (seq < 4 ? 
0 : INVPCID_PCID) || + x86emul_invpcid_type(aux) != (seq++ & 3) ) + return X86EMUL_UNHANDLEABLE; + + return X86EMUL_OKAY; +} + static struct x86_emulate_ops emulops = { .read = read, .insn_fetch = fetch, @@ -4482,6 +4514,46 @@ int main(int argc, char **argv) printf("okay\n"); } else + printf("skipped\n"); + + printf("%-40s", "Testing invpcid 16(%ecx),%%edx..."); + if ( stack_exec ) + { + decl_insn(invpcid); + + asm volatile ( put_insn(invpcid, "invpcid 16(%0), %1") + :: "c" (NULL), "d" (0L) ); + + res[4] = 0; + res[5] = 0; + res[6] = INVPCID_ADDR; + res[7] = 0; + regs.ecx = (unsigned long)res; + emulops.tlb_op = tlb_op_invpcid; + + for ( ; ; ) + { + for ( regs.edx = 0; regs.edx < 4; ++regs.edx ) + { + set_insn(invpcid); + rc = x86_emulate(&ctxt, &emulops); + if ( rc != X86EMUL_OKAY || !check_eip(invpcid) ) + goto fail; + } + + if ( ctxt.addr_size < 64 || res[4] == INVPCID_PCID ) + break; + + emulops.read_cr = read_cr_invpcid; + res[4] = INVPCID_PCID; + } + + emulops.read_cr = emul_test_read_cr; + emulops.tlb_op = NULL; + + printf("okay\n"); + } + else printf("skipped\n"); #undef decl_insn --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -72,6 +72,7 @@ bool emul_test_init(void) * them. */ cp.basic.movbe = true; + cp.feat.invpcid = true; cp.feat.adx = true; cp.feat.avx512pf = cp.feat.avx512f; cp.feat.rdpid = true; @@ -141,7 +142,7 @@ int emul_test_cpuid( */ if ( leaf == 7 && subleaf == 0 ) { - res->b |= 1U << 19; + res->b |= (1U << 10) | (1U << 19); if ( res->b & (1U << 16) ) res->b |= 1U << 26; res->c |= 1U << 22; --- a/xen/arch/x86/hvm/emulate.c +++ b/xen/arch/x86/hvm/emulate.c @@ -2383,8 +2383,16 @@ static int hvmemul_tlb_op( paging_invlpg(current, addr); break; + case x86emul_invpcid: + if ( x86emul_invpcid_type(aux) != X86_INVPCID_TYPE_INDIV_ADDR ) + { + hvm_asid_flush_vcpu(current); + break; + } + aux = x86emul_invpcid_pcid(aux); + /* fall through */ case x86emul_invlpga: - /* TODO: Support ASIDs. 
*/ + /* TODO: Support ASIDs/PCIDs. */ if ( !aux ) paging_invlpg(current, addr); else --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -496,6 +496,7 @@ static const struct ext0f38_table { [0x7a ... 0x7c] = { .simd_size = simd_none, .two_op = 1 }, [0x7d ... 0x7e] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x7f] = { .simd_size = simd_packed_fp, .d8s = d8s_vl }, + [0x82] = { .simd_size = simd_other }, [0x83] = { .simd_size = simd_packed_int, .d8s = d8s_vl }, [0x88] = { .simd_size = simd_packed_fp, .two_op = 1, .d8s = d8s_dq }, [0x89] = { .simd_size = simd_packed_int, .two_op = 1, .d8s = d8s_dq }, @@ -1875,6 +1876,7 @@ in_protmode( #define vcpu_has_hle() (ctxt->cpuid->feat.hle) #define vcpu_has_avx2() (ctxt->cpuid->feat.avx2) #define vcpu_has_bmi2() (ctxt->cpuid->feat.bmi2) +#define vcpu_has_invpcid() (ctxt->cpuid->feat.invpcid) #define vcpu_has_rtm() (ctxt->cpuid->feat.rtm) #define vcpu_has_mpx() (ctxt->cpuid->feat.mpx) #define vcpu_has_avx512f() (ctxt->cpuid->feat.avx512f) @@ -9124,6 +9126,48 @@ x86_emulate( ASSERT(!state->simd_size); break; + case X86EMUL_OPC_66(0x0f38, 0x82): /* invpcid reg,m128 */ + vcpu_must_have(invpcid); + generate_exception_if(ea.type != OP_MEM, EXC_UD); + generate_exception_if(!mode_ring0(), EXC_GP, 0); + + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 16, + ctxt)) != X86EMUL_OKAY ) + goto done; + + generate_exception_if(mmvalp->xmm[0] & ~0xfff, EXC_GP, 0); + dst.val = mode_64bit() ? 
*dst.reg : (uint32_t)*dst.reg; + + switch ( dst.val ) + { + case X86_INVPCID_TYPE_INDIV_ADDR: + generate_exception_if(!is_canonical_address(mmvalp->xmm[1]), + EXC_GP, 0); + /* fall through */ + case X86_INVPCID_TYPE_SINGLE_CTXT: + if ( !mode_64bit() || !ops->read_cr ) + cr4 = 0; + else if ( (rc = ops->read_cr(4, &cr4, ctxt)) != X86EMUL_OKAY ) + goto done; + generate_exception_if(!(cr4 & X86_CR4_PCIDE) && mmvalp->xmm[0], + EXC_GP, 0); + break; + case X86_INVPCID_TYPE_ALL_INCL_GLOBAL: + case X86_INVPCID_TYPE_ALL_NON_GLOBAL: + break; + default: + generate_exception(EXC_GP, 0); + } + + fail_if(!ops->tlb_op); + if ( (rc = ops->tlb_op(x86emul_invpcid, truncate_ea(mmvalp->xmm[1]), + x86emul_invpcid_aux(mmvalp->xmm[0], dst.val), + ctxt)) != X86EMUL_OKAY ) + goto done; + + state->simd_size = simd_none; + break; + case X86EMUL_OPC_EVEX_66(0x0f38, 0x83): /* vpmultishiftqb [xyz]mm/mem,[xyz]mm,[xyz]mm{k} */ generate_exception_if(!evex.w, EXC_UD); host_and_vcpu_must_have(avx512_vbmi); --- a/xen/arch/x86/x86_emulate/x86_emulate.h +++ b/xen/arch/x86/x86_emulate/x86_emulate.h @@ -188,8 +188,26 @@ enum x86emul_cache_op { enum x86emul_tlb_op { x86emul_invlpg, x86emul_invlpga, + x86emul_invpcid, }; +static inline unsigned int x86emul_invpcid_aux(unsigned int pcid, + unsigned int type) +{ + ASSERT(!(pcid & ~0xfff)); + return (type << 12) | pcid; +} + +static inline unsigned int x86emul_invpcid_pcid(unsigned int aux) +{ + return aux & 0xfff; +} + +static inline unsigned int x86emul_invpcid_type(unsigned int aux) +{ + return aux >> 12; +} + struct x86_emulate_state; /* @@ -483,6 +501,8 @@ struct x86_emulate_ops * @addr and @aux have @op-specific meaning: * - INVLPG: @aux:@addr represent seg:offset * - INVLPGA: @addr is the linear address, @aux the ASID + * - INVPCID: @addr is the linear address, @aux the combination of + * PCID and type (see x86emul_invpcid_*()). 
*/ int (*tlb_op)( enum x86emul_tlb_op op, unsigned long addr, unsigned long aux, struct x86_emulate_ctxt *ctxt); [-- Attachment #7: 06-x86emul-MOVDIR.patch --] [-- Type: text/plain, Size: 5975 bytes --] x86emul: support MOVDIR{I,64B} insns Note that the ISA extensions document revision 035 doesn't specify exception behavior for ModRM.mod == 0b11; assuming #UD here. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/x86_emulate/x86_emulate.c +++ b/xen/arch/x86/x86_emulate/x86_emulate.c @@ -548,6 +548,8 @@ static const struct ext0f38_table { [0xf1] = { .to_mem = 1, .two_op = 1 }, [0xf2 ... 0xf3] = {}, [0xf5 ... 0xf7] = {}, + [0xf8] = { .simd_size = simd_other }, + [0xf9] = { .to_mem = 1 }, }; /* Shift values between src and dst sizes of pmov{s,z}x{b,w,d}{w,d,q}. */ @@ -1902,6 +1904,8 @@ in_protmode( #define vcpu_has_avx512_bitalg() (ctxt->cpuid->feat.avx512_bitalg) #define vcpu_has_avx512_vpopcntdq() (ctxt->cpuid->feat.avx512_vpopcntdq) #define vcpu_has_rdpid() (ctxt->cpuid->feat.rdpid) +#define vcpu_has_movdiri() (ctxt->cpuid->feat.movdiri) +#define vcpu_has_movdir64b() (ctxt->cpuid->feat.movdir64b) #define vcpu_has_avx512_4vnniw() (ctxt->cpuid->feat.avx512_4vnniw) #define vcpu_has_avx512_4fmaps() (ctxt->cpuid->feat.avx512_4fmaps) @@ -2693,10 +2697,12 @@ x86_decode_0f38( { case 0x00 ... 0xef: case 0xf2 ... 0xf5: - case 0xf7 ... 0xff: + case 0xf7 ... 0xf8: + case 0xfa ... 0xff: op_bytes = 0; /* fall through */ case 0xf6: /* adcx / adox */ + case 0xf9: /* movdiri */ ctxt->opcode |= MASK_INSR(vex.pfx, X86EMUL_OPC_PFX_MASK); break; @@ -9896,6 +9902,32 @@ x86_emulate( : "0" ((uint32_t)src.val), "rm" (_regs.edx) ); break; + case X86EMUL_OPC_66(0x0f38, 0xf8): /* movdir64b r,m512 */ + vcpu_must_have(movdir64b); + generate_exception_if(ea.type != OP_MEM, EXC_UD); + src.val = truncate_ea(*dst.reg); + generate_exception_if(!is_aligned(x86_seg_es, src.val, 64, ctxt, ops), + EXC_GP, 0); + /* Ignore the non-temporal behavior for now.
*/ + fail_if(!ops->write); + BUILD_BUG_ON(sizeof(*mmvalp) < 64); + if ( (rc = ops->read(ea.mem.seg, ea.mem.off, mmvalp, 64, + ctxt)) != X86EMUL_OKAY || + (rc = ops->write(x86_seg_es, src.val, mmvalp, 64, + ctxt)) != X86EMUL_OKAY ) + goto done; + state->simd_size = simd_none; + sfence = true; + break; + + case X86EMUL_OPC(0x0f38, 0xf9): /* movdiri mem,r */ + vcpu_must_have(movdiri); + generate_exception_if(dst.type != OP_MEM, EXC_UD); + /* Ignore the non-temporal behavior for now. */ + dst.val = src.val; + sfence = true; + break; + case X86EMUL_OPC_VEX_66(0x0f3a, 0x00): /* vpermq $imm8,ymm/m256,ymm */ case X86EMUL_OPC_VEX_66(0x0f3a, 0x01): /* vpermpd $imm8,ymm/m256,ymm */ generate_exception_if(!vex.l || !vex.w, EXC_UD); --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -237,6 +237,8 @@ XEN_CPUFEATURE(AVX512_BITALG, 6*32+12) / XEN_CPUFEATURE(AVX512_VPOPCNTDQ, 6*32+14) /*A POPCNT for vectors of DW/QW */ XEN_CPUFEATURE(RDPID, 6*32+22) /*A RDPID instruction */ XEN_CPUFEATURE(CLDEMOTE, 6*32+25) /*A CLDEMOTE instruction */ +XEN_CPUFEATURE(MOVDIRI, 6*32+27) /*A MOVDIRI instruction */ +XEN_CPUFEATURE(MOVDIR64B, 6*32+28) /*A MOVDIR64B instruction */ /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */ XEN_CPUFEATURE(ITSC, 7*32+ 8) /* Invariant TSC */ --- a/tools/tests/x86_emulator/test_x86_emulator.c +++ b/tools/tests/x86_emulator/test_x86_emulator.c @@ -2196,6 +2196,36 @@ int main(int argc, char **argv) goto fail; printf("okay\n"); + printf("%-40s", "Testing movdiri %edx,(%ecx)..."); + instr[0] = 0x0f; instr[1] = 0x38; instr[2] = 0xf9; instr[3] = 0x11; + regs.eip = (unsigned long)&instr[0]; + regs.ecx = (unsigned long)memset(res, -1, 16); + regs.edx = 0x44332211; + rc = x86_emulate(&ctxt, &emulops); + if ( (rc != X86EMUL_OKAY) || + (regs.eip != (unsigned long)&instr[4]) || + res[0] != 0x44332211 || ~res[1] ) + goto fail; + printf("okay\n"); + + printf("%-40s", "Testing movdir64b 144(%edx),%ecx..."); + 
instr[0] = 0x66; instr[1] = 0x0f; instr[2] = 0x38; instr[3] = 0xf8; + instr[4] = 0x8a; instr[5] = 0x90; instr[8] = instr[7] = instr[6] = 0; + regs.eip = (unsigned long)&instr[0]; + for ( i = 0; i < 64; ++i ) + res[i] = i - 20; + regs.edx = (unsigned long)res; + regs.ecx = (unsigned long)(res + 16); + rc = x86_emulate(&ctxt, &emulops); + if ( (rc != X86EMUL_OKAY) || + (regs.eip != (unsigned long)&instr[9]) || + res[15] != -5 || res[32] != 12 ) + goto fail; + for ( i = 16; i < 32; ++i ) + if ( res[i] != i ) + goto fail; + printf("okay\n"); + printf("%-40s", "Testing movq %mm3,(%ecx)..."); if ( stack_exec && cpu_has_mmx ) { --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -76,6 +76,8 @@ bool emul_test_init(void) cp.feat.adx = true; cp.feat.avx512pf = cp.feat.avx512f; cp.feat.rdpid = true; + cp.feat.movdiri = true; + cp.feat.movdir64b = true; cp.extd.clzero = true; if ( cpu_has_xsave ) @@ -137,15 +139,15 @@ int emul_test_cpuid( res->c |= 1U << 22; /* - * The emulator doesn't itself use ADCX/ADOX/RDPID nor the S/G prefetch - * insns, so we can always run the respective tests. + * The emulator doesn't itself use ADCX/ADOX/RDPID/MOVDIR* nor the S/G + * prefetch insns, so we can always run the respective tests. */ if ( leaf == 7 && subleaf == 0 ) { res->b |= (1U << 10) | (1U << 19); if ( res->b & (1U << 16) ) res->b |= 1U << 26; - res->c |= 1U << 22; + res->c |= (1U << 22) | (1U << 27) | (1U << 28); } /*
Thread overview: 36+ messages -- 2019-07-01 11:47 [Xen-devel] [PATCH 0/6] x86emul: further work Jan Beulich 2019-07-01 11:56 ` [Xen-devel] [PATCH 1/6] x86emul: generalize wbinvd() hook Jan Beulich 2019-07-02 10:22 ` Paul Durrant 2019-07-02 10:50 ` Jan Beulich 2019-07-02 10:53 ` Paul Durrant 2019-08-27 10:44 ` Andrew Cooper 2019-08-27 12:51 ` Andrew Cooper 2019-08-27 18:47 ` Andrew Cooper 2019-09-02 11:10 ` [Xen-devel] Ping: " Jan Beulich 2019-09-02 12:04 ` Paul Durrant 2019-07-01 11:56 ` [Xen-devel] [PATCH 2/6] x86emul: support WBNOINVD Jan Beulich 2019-07-02 10:38 ` Paul Durrant 2019-08-27 14:47 ` Andrew Cooper 2019-08-27 15:08 ` Jan Beulich 2019-08-27 16:45 ` Andrew Cooper 2019-08-28 11:42 ` Andrew Cooper 2019-07-01 11:56 ` [Xen-devel] [PATCH 3/6] x86emul: generalize invlpg() hook Jan Beulich 2019-07-02 10:47 ` Paul Durrant 2019-08-27 14:55 ` Andrew Cooper 2019-07-01 11:57 ` [Xen-devel] [PATCH 4/6] x86: move INVPCID_TYPE_* to x86-defns.h Jan Beulich 2019-07-02 10:49 ` Paul Durrant 2019-08-27 14:57 ` Andrew Cooper 2019-07-01 11:57 ` [Xen-devel] [PATCH 5/6] x86emul: support INVPCID Jan Beulich 2019-07-02 12:54 ` Paul Durrant 2019-08-27 15:31 ` Andrew Cooper 2019-08-27 15:53 ` Jan Beulich 2019-08-27 17:27 ` Andrew Cooper 2019-08-28 7:14 ` Jan Beulich 2019-08-28 11:33 ` Andrew Cooper 2019-07-01 11:58 ` [Xen-devel] [PATCH 6/6] x86emul: support MOVDIR{I,64B} insns Jan Beulich 2019-08-27 16:04 ` Andrew Cooper 2019-08-28 6:26 ` Jan Beulich 2019-08-28 11:51 ` Andrew Cooper 2019-08-28 12:19 ` Jan Beulich 2019-08-15 14:24 ` [Xen-devel] [PATCH 0/6] x86emul: further work Andrew Cooper 2019-08-27 8:37 ` Jan Beulich