* [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() @ 2014-06-02 15:50 Mihai Caraman 2014-06-02 15:50 ` [PATCH 1/4 v3] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman ` (3 more replies) 0 siblings, 4 replies; 8+ messages in thread From: Mihai Caraman @ 2014-06-02 15:50 UTC (permalink / raw) To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm Read guest last instruction from kvmppc_get_last_inst() allowing the function to fail in order to emulate again. On bookehv architecture search for the physical address and kmap it, instead of using Load External PID (lwepx) instruction. This fixes an infinite loop caused by lwepx's data TLB miss exception handled in the host and the TODO for execute-but-not-read entries and TLB eviction. Mihai Caraman (4): KVM: PPC: e500mc: Revert "add load inst fixup" KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 KVM: PPC: Alow kvmppc_get_last_inst() to fail KVM: PPC: Bookehv: Get vcpu's last instruction for emulation arch/powerpc/include/asm/kvm_book3s.h | 28 ++------ arch/powerpc/include/asm/kvm_booke.h | 7 +- arch/powerpc/include/asm/kvm_ppc.h | 16 +++++ arch/powerpc/include/asm/mmu-book3e.h | 7 +- arch/powerpc/kvm/book3s_64_mmu_hv.c | 16 ++--- arch/powerpc/kvm/book3s_paired_singles.c | 38 ++++++---- arch/powerpc/kvm/book3s_pr.c | 116 +++++++++++++++++-------------- arch/powerpc/kvm/booke.c | 35 ++++++++++ arch/powerpc/kvm/bookehv_interrupts.S | 55 ++------------- arch/powerpc/kvm/e500_mmu_host.c | 98 ++++++++++++++++++++++++++ arch/powerpc/kvm/emulate.c | 18 +++-- arch/powerpc/kvm/powerpc.c | 10 ++- 12 files changed, 279 insertions(+), 165 deletions(-) -- 1.7.11.7 ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/4 v3] KVM: PPC: e500mc: Revert "add load inst fixup" 2014-06-02 15:50 [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman @ 2014-06-02 15:50 ` Mihai Caraman 2014-06-02 15:50 ` [PATCH 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Mihai Caraman ` (2 subsequent siblings) 3 siblings, 0 replies; 8+ messages in thread From: Mihai Caraman @ 2014-06-02 15:50 UTC (permalink / raw) To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm The commit 1d628af7 "add load inst fixup" made an attempt to handle failures generated by reading the guest current instruction. The fixup code that was added works by chance hiding the real issue. Load external pid (lwepx) instruction, used by KVM to read guest instructions, is executed in a substituted guest translation context (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage interrupts need to be handled by KVM, even though these interrupts are generated from host context (MSR[GS] = 0) where lwepx is executed. Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. As a result, the host kernel handles lwepx faults searching the faulting guest data address (loaded in DEAR) in its own Logical Partition ID (LPID) 0 context. In case a host translation is found the execution returns to the lwepx instruction instead of the fixup, the host ending up in an infinite loop. Revert the commit "add load inst fixup". lwepx issue will be addressed in a subsequent patch without needing fixup code. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> --- v3: - no change v2: - reworked patch description arch/powerpc/kvm/bookehv_interrupts.S | 26 +------------------------- 1 file changed, 1 insertion(+), 25 deletions(-) diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index a1712b8..6ff4480 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -29,7 +29,6 @@ #include <asm/asm-compat.h> #include <asm/asm-offsets.h> #include <asm/bitsperlong.h> -#include <asm/thread_info.h> #ifdef CONFIG_64BIT #include <asm/exception-64e.h> @@ -164,32 +163,9 @@ PPC_STL r30, VCPU_GPR(R30)(r4) PPC_STL r31, VCPU_GPR(R31)(r4) mtspr SPRN_EPLC, r8 - - /* disable preemption, so we are sure we hit the fixup handler */ - CURRENT_THREAD_INFO(r8, r1) - li r7, 1 - stw r7, TI_PREEMPT(r8) - isync - - /* - * In case the read goes wrong, we catch it and write an invalid value - * in LAST_INST instead. - */ -1: lwepx r9, 0, r5 -2: -.section .fixup, "ax" -3: li r9, KVM_INST_FETCH_FAILED - b 2b -.previous -.section __ex_table,"a" - PPC_LONG_ALIGN - PPC_LONG 1b,3b -.previous - + lwepx r9, 0, r5 mtspr SPRN_EPLC, r3 - li r7, 0 - stw r7, TI_PREEMPT(r8) stw r9, VCPU_LAST_INST(r4) .endif -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 2014-06-02 15:50 [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman 2014-06-02 15:50 ` [PATCH 1/4 v3] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman @ 2014-06-02 15:50 ` Mihai Caraman 2014-06-02 15:50 ` [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman 2014-06-02 15:50 ` [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman 3 siblings, 0 replies; 8+ messages in thread From: Mihai Caraman @ 2014-06-02 15:50 UTC (permalink / raw) To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm Add mising defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> --- v3: - no change v2: - no change arch/powerpc/include/asm/mmu-book3e.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h index 901dac6..60a949a 100644 --- a/arch/powerpc/include/asm/mmu-book3e.h +++ b/arch/powerpc/include/asm/mmu-book3e.h @@ -40,7 +40,11 @@ /* MAS registers bit definitions */ -#define MAS0_TLBSEL(x) (((x) << 28) & 0x30000000) +#define MAS0_TLBSEL_MASK 0x30000000 +#define MAS0_TLBSEL_SHIFT 28 +#define MAS0_TLBSEL(x) (((x) << MAS0_TLBSEL_SHIFT) & MAS0_TLBSEL_MASK) +#define MAS0_GET_TLBSEL(mas0) (((mas0) & MAS0_TLBSEL_MASK) >> \ + MAS0_TLBSEL_SHIFT) #define MAS0_ESEL_MASK 0x0FFF0000 #define MAS0_ESEL_SHIFT 16 #define MAS0_ESEL(x) (((x) << MAS0_ESEL_SHIFT) & MAS0_ESEL_MASK) @@ -58,6 +62,7 @@ #define MAS1_TSIZE_MASK 0x00000f80 #define MAS1_TSIZE_SHIFT 7 #define MAS1_TSIZE(x) (((x) << MAS1_TSIZE_SHIFT) & MAS1_TSIZE_MASK) +#define MAS1_GET_TSIZE(mas1) (((mas1) & MAS1_TSIZE_MASK) >> MAS1_TSIZE_SHIFT) #define MAS2_EPN (~0xFFFUL) #define MAS2_X0 0x00000040 -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail 2014-06-02 15:50 [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman 2014-06-02 15:50 ` [PATCH 1/4 v3] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman 2014-06-02 15:50 ` [PATCH 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Mihai Caraman @ 2014-06-02 15:50 ` Mihai Caraman 2014-06-12 15:15 ` Alexander Graf 2014-06-02 15:50 ` [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman 3 siblings, 1 reply; 8+ messages in thread From: Mihai Caraman @ 2014-06-02 15:50 UTC (permalink / raw) To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> --- v3: - these changes compile on book3s, please validate the functionality and do the necessary adaptations! - rework patch description - add common definition for kvmppc_get_last_inst() - check return values in book3s code v2: - integrated kvmppc_get_last_inst() in book3s code and checked build - addressed cosmetic feedback arch/powerpc/include/asm/kvm_book3s.h | 28 ++------ arch/powerpc/include/asm/kvm_booke.h | 7 +- arch/powerpc/include/asm/kvm_ppc.h | 16 +++++ arch/powerpc/kvm/book3s_64_mmu_hv.c | 16 ++--- arch/powerpc/kvm/book3s_paired_singles.c | 38 ++++++---- arch/powerpc/kvm/book3s_pr.c | 116 +++++++++++++++++-------------- arch/powerpc/kvm/booke.c | 3 + arch/powerpc/kvm/e500_mmu_host.c | 5 ++ arch/powerpc/kvm/emulate.c | 18 +++-- arch/powerpc/kvm/powerpc.c | 10 ++- 10 files changed, 142 insertions(+), 115 deletions(-) diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h index f52f656..3409572 100644 --- a/arch/powerpc/include/asm/kvm_book3s.h +++ b/arch/powerpc/include/asm/kvm_book3s.h @@ -274,30 +274,14 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu) return (kvmppc_get_msr(vcpu) & MSR_LE) != (MSR_KERNEL & MSR_LE); } -static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc) +static inline int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, + u32 *inst) { - /* Load the instruction manually if it failed to do so in the - * exit path */ - if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) - kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false); + ulong pc = kvmppc_get_pc(vcpu); - return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) : - vcpu->arch.last_inst; -} - -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu) -{ - return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu)); -} - -/* - * Like kvmppc_get_last_inst(), but for fetching a sc instruction. - * Because the sc instruction sets SRR0 to point to the following - * instruction, we have to fetch from pc - 4. - */ -static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu) -{ - return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4); + if (prev) + pc -= 4; + return kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false); } static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu) diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h index c7aed61..1e28371 100644 --- a/arch/powerpc/include/asm/kvm_booke.h +++ b/arch/powerpc/include/asm/kvm_booke.h @@ -33,6 +33,8 @@ #define KVMPPC_INST_EHPRIV_DEBUG (KVMPPC_INST_EHPRIV | \ (EHPRIV_OC_DEBUG << EHPRIV_OC_SHIFT)) +extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *inst); + static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val) { vcpu->arch.gpr[num] = val; @@ -69,11 +71,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu) return false; } -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu) -{ - return vcpu->arch.last_inst; -} - static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val) { vcpu->arch.ctr = val; diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h index 4a7cc45..619be2f 100644 --- a/arch/powerpc/include/asm/kvm_ppc.h +++ b/arch/powerpc/include/asm/kvm_ppc.h @@ -234,6 +234,22 @@ struct kvmppc_ops { extern struct kvmppc_ops *kvmppc_hv_ops; extern struct kvmppc_ops *kvmppc_pr_ops; +static inline int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, bool prev, + u32 *inst) +{ + int ret = EMULATE_DONE; + + /* Load the instruction manually if it failed to do so in the + * exit path */ + if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) + ret = kvmppc_load_last_inst(vcpu, prev, &vcpu->arch.last_inst); + + *inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) : + vcpu->arch.last_inst; + + return ret; +} + static inline bool is_kvmppc_hv_enabled(struct kvm *kvm) { return kvm->arch.kvm_ops == kvmppc_hv_ops; diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c index 8056107..2ffb3dd 100644 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c @@ -530,21 +530,13 @@ static int instruction_is_store(unsigned int instr) static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu, unsigned long gpa, gva_t ea, int is_store) { - int ret; u32 last_inst; - unsigned long srr0 = kvmppc_get_pc(vcpu); - /* We try to load the last instruction. We don't let - * emulate_instruction do it as it doesn't check what - * kvmppc_ld returns. + /* * If we fail, we just return to the guest and try executing it again. */ - if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) { - ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false); - if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED) - return RESUME_GUEST; - vcpu->arch.last_inst = last_inst; - } + if (kvmppc_get_last_inst(vcpu, false, &last_inst) != EMULATE_DONE) + return RESUME_GUEST; /* * WARNING: We do not know for sure whether the instruction we just @@ -558,7 +550,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu, * we just return and retry the instruction. */ - if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store) + if (instruction_is_store(last_inst) != !!is_store) return RESUME_GUEST; /* diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c index 6c8011f..6362ad4 100644 --- a/arch/powerpc/kvm/book3s_paired_singles.c +++ b/arch/powerpc/kvm/book3s_paired_singles.c @@ -639,26 +639,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc, int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu) { - u32 inst = kvmppc_get_last_inst(vcpu); + u32 inst; enum emulation_result emulated = EMULATE_DONE; + int ax_rd, ax_ra, ax_rb, ax_rc; + short full_d; + u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c; - int ax_rd = inst_get_field(inst, 6, 10); - int ax_ra = inst_get_field(inst, 11, 15); - int ax_rb = inst_get_field(inst, 16, 20); - int ax_rc = inst_get_field(inst, 21, 25); - short full_d = inst_get_field(inst, 16, 31); - - u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd); - u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra); - u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb); - u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc); - - bool rcomp = (inst & 1) ? true : false; - u32 cr = kvmppc_get_cr(vcpu); + bool rcomp; + u32 cr; #ifdef DEBUG int i; #endif + emulated = kvmppc_get_last_inst(vcpu, false, &inst); + if (emulated != EMULATE_DONE) + return emulated; + + ax_rd = inst_get_field(inst, 6, 10); + ax_ra = inst_get_field(inst, 11, 15); + ax_rb = inst_get_field(inst, 16, 20); + ax_rc = inst_get_field(inst, 21, 25); + full_d = inst_get_field(inst, 16, 31); + + fpr_d = &VCPU_FPR(vcpu, ax_rd); + fpr_a = &VCPU_FPR(vcpu, ax_ra); + fpr_b = &VCPU_FPR(vcpu, ax_rb); + fpr_c = &VCPU_FPR(vcpu, ax_rc); + + rcomp = (inst & 1) ? true : false; + cr = kvmppc_get_cr(vcpu); + if (!kvmppc_inst_is_paired_single(vcpu, inst)) return EMULATE_FAIL; diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c index 23367a7..48b633c 100644 --- a/arch/powerpc/kvm/book3s_pr.c +++ b/arch/powerpc/kvm/book3s_pr.c @@ -637,42 +637,6 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac) #endif } -static int kvmppc_read_inst(struct kvm_vcpu *vcpu) -{ - ulong srr0 = kvmppc_get_pc(vcpu); - u32 last_inst = kvmppc_get_last_inst(vcpu); - int ret; - - ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false); - if (ret == -ENOENT) { - ulong msr = kvmppc_get_msr(vcpu); - - msr = kvmppc_set_field(msr, 33, 33, 1); - msr = kvmppc_set_field(msr, 34, 36, 0); - msr = kvmppc_set_field(msr, 42, 47, 0); - kvmppc_set_msr_fast(vcpu, msr); - kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE); - return EMULATE_AGAIN; - } - - return EMULATE_DONE; -} - -static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr) -{ - - /* Need to do paired single emulation? */ - if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) - return EMULATE_DONE; - - /* Read out the instruction */ - if (kvmppc_read_inst(vcpu) == EMULATE_DONE) - /* Need to emulate */ - return EMULATE_FAIL; - - return EMULATE_AGAIN; -} - /* Handle external providers (FPU, Altivec, VSX) */ static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, ulong msr) @@ -977,15 +941,24 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu, { enum emulation_result er; ulong flags; + u32 last_inst; + int emul; program_interrupt: flags = vcpu->arch.shadow_srr1 & 0x1f0000ull; + emul = kvmppc_get_last_inst(vcpu, false, &last_inst); + if (emul != EMULATE_DONE) { + r = RESUME_GUEST; + break; + } + if (kvmppc_get_msr(vcpu) & MSR_PR) { #ifdef EXIT_DEBUG - printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu)); + pr_info("Userspace triggered 0x700 exception at\n" + "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst); #endif - if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) != + if ((last_inst & 0xff0007ff) != (INS_DCBZ & 0xfffffff7)) { kvmppc_core_queue_program(vcpu, flags); r = RESUME_GUEST; @@ -1004,7 +977,7 @@ program_interrupt: break; case EMULATE_FAIL: printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n", - __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu)); + __func__, kvmppc_get_pc(vcpu), last_inst); kvmppc_core_queue_program(vcpu, flags); r = RESUME_GUEST; break; @@ -1021,8 +994,25 @@ program_interrupt: break; } case BOOK3S_INTERRUPT_SYSCALL: + { + u32 last_sc; + int emul; + + /* Get last sc for papr */ + if (vcpu->arch.papr_enabled) { + /* + * The sc instuction sets SRR0 to point to the next inst + */ + emul = kvmppc_get_last_inst(vcpu, true, &last_sc); + if (emul != EMULATE_DONE) { + kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) - 4); + r = RESUME_GUEST; + break; + } + } + if (vcpu->arch.papr_enabled && - (kvmppc_get_last_sc(vcpu) == 0x44000022) && + (last_sc == 0x44000022) && !(kvmppc_get_msr(vcpu) & MSR_PR)) { /* SC 1 papr hypercalls */ ulong cmd = kvmppc_get_gpr(vcpu, 3); @@ -1067,36 +1057,53 @@ program_interrupt: r = RESUME_GUEST; } break; + } case BOOK3S_INTERRUPT_FP_UNAVAIL: case BOOK3S_INTERRUPT_ALTIVEC: case BOOK3S_INTERRUPT_VSX: { int ext_msr = 0; + int emul; + u32 last_inst; - switch (exit_nr) { - case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP; break; - case BOOK3S_INTERRUPT_ALTIVEC: ext_msr = MSR_VEC; break; - case BOOK3S_INTERRUPT_VSX: ext_msr = MSR_VSX; break; - } + if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) { + /* Do paired single emulation */ + + switch (exit_nr) { + case BOOK3S_INTERRUPT_FP_UNAVAIL: + ext_msr = MSR_FP; + break; + + case BOOK3S_INTERRUPT_ALTIVEC: + ext_msr = MSR_VEC; + break; + + case BOOK3S_INTERRUPT_VSX: + ext_msr = MSR_VSX; + break; + } - switch (kvmppc_check_ext(vcpu, exit_nr)) { - case EMULATE_DONE: - /* everything ok - let's enable the ext */ r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr); break; - case EMULATE_FAIL: + } + + emul = kvmppc_get_last_inst(vcpu, false, &last_inst); + if (emul == EMULATE_DONE) { /* we need to emulate this instruction */ goto program_interrupt; break; - default: - /* nothing to worry about - go again */ - break; + } else { + r = RESUME_GUEST; } + break; } case BOOK3S_INTERRUPT_ALIGNMENT: - if (kvmppc_read_inst(vcpu) == EMULATE_DONE) { - u32 last_inst = kvmppc_get_last_inst(vcpu); + { + u32 last_inst; + int emul = kvmppc_get_last_inst(vcpu, false, &last_inst); + + if (emul == EMULATE_DONE) { u32 dsisr; u64 dar; @@ -1110,6 +1117,7 @@ program_interrupt: } r = RESUME_GUEST; break; + } #ifdef CONFIG_PPC_BOOK3S_64 case BOOK3S_INTERRUPT_FAC_UNAVAIL: kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56); diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index ab62109..34a42b9 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) * they were actually modified by emulation. */ return RESUME_GUEST_NV; + case EMULATE_AGAIN: + return RESUME_GUEST; + case EMULATE_DO_DCR: run->exit_reason = KVM_EXIT_DCR; return RESUME_HOST; diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index dd2cc03..f692c12 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -606,6 +606,11 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, } } +int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *instr) +{ + return EMULATE_FAIL; +} + /************* MMU Notifiers *************/ int kvm_unmap_hva(struct kvm *kvm, unsigned long hva) diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c index da86d9b..c5c64b6 100644 --- a/arch/powerpc/kvm/emulate.c +++ b/arch/powerpc/kvm/emulate.c @@ -224,19 +224,25 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt) * from opcode tables in the future. */ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu) { - u32 inst = kvmppc_get_last_inst(vcpu); - int ra = get_ra(inst); - int rs = get_rs(inst); - int rt = get_rt(inst); - int sprn = get_sprn(inst); - enum emulation_result emulated = EMULATE_DONE; + u32 inst; + int ra, rs, rt, sprn; + enum emulation_result emulated; int advance = 1; /* this default type might be overwritten by subcategories */ kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS); + emulated = kvmppc_get_last_inst(vcpu, false, &inst); + if (emulated != EMULATE_DONE) + return emulated; + pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst)); + ra = get_ra(inst); + rs = get_rs(inst); + rt = get_rt(inst); + sprn = get_sprn(inst); + switch (get_op(inst)) { case OP_TRAP: #ifdef CONFIG_PPC_BOOK3S diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index bab20f4..1dba84b 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -261,6 +261,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) * actually modified. */ r = RESUME_GUEST_NV; break; + case EMULATE_AGAIN: + r = RESUME_GUEST; + break; case EMULATE_DO_MMIO: run->exit_reason = KVM_EXIT_MMIO; /* We must reload nonvolatiles because "update" load/store @@ -270,11 +273,14 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) r = RESUME_HOST_NV; break; case EMULATE_FAIL: + { + u32 last_inst; + kvmppc_get_last_inst(vcpu, false, &last_inst); /* XXX Deliver Program interrupt to guest. */ - printk(KERN_EMERG "%s: emulation failed (%08x)\n", __func__, - kvmppc_get_last_inst(vcpu)); + pr_emerg("%s: emulation failed (%08x)\n", __func__, last_inst); r = RESUME_HOST; break; + } default: WARN_ON(1); r = RESUME_GUEST; -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail 2014-06-02 15:50 ` [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman @ 2014-06-12 15:15 ` Alexander Graf 0 siblings, 0 replies; 8+ messages in thread From: Alexander Graf @ 2014-06-12 15:15 UTC (permalink / raw) To: Mihai Caraman; +Cc: linuxppc-dev, kvm, kvm-ppc On 06/02/2014 05:50 PM, Mihai Caraman wrote: > On book3e, guest last instruction is read on the exit path using load > external pid (lwepx) dedicated instruction. This load operation may fail > due to TLB eviction and execute-but-not-read entries. > > This patch lay down the path for an alternative solution to read the guest > last instruction, by allowing kvmppc_get_lat_inst() function to fail. > Architecture specific implmentations of kvmppc_load_last_inst() may read > last guest instruction and instruct the emulation layer to re-execute the > guest in case of failure. > > Make kvmppc_get_last_inst() definition common between architectures. > > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> > --- > v3: > - these changes compile on book3s, please validate the functionality and > do the necessary adaptations! > - rework patch description > - add common definition for kvmppc_get_last_inst() > - check return values in book3s code > > v2: > - integrated kvmppc_get_last_inst() in book3s code and checked build > - addressed cosmetic feedback > > arch/powerpc/include/asm/kvm_book3s.h | 28 ++------ > arch/powerpc/include/asm/kvm_booke.h | 7 +- > arch/powerpc/include/asm/kvm_ppc.h | 16 +++++ > arch/powerpc/kvm/book3s_64_mmu_hv.c | 16 ++--- > arch/powerpc/kvm/book3s_paired_singles.c | 38 ++++++---- > arch/powerpc/kvm/book3s_pr.c | 116 +++++++++++++++++-------------- > arch/powerpc/kvm/booke.c | 3 + > arch/powerpc/kvm/e500_mmu_host.c | 5 ++ > arch/powerpc/kvm/emulate.c | 18 +++-- > arch/powerpc/kvm/powerpc.c | 10 ++- > 10 files changed, 142 insertions(+), 115 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h > index f52f656..3409572 100644 > --- a/arch/powerpc/include/asm/kvm_book3s.h > +++ b/arch/powerpc/include/asm/kvm_book3s.h > @@ -274,30 +274,14 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu) > return (kvmppc_get_msr(vcpu) & MSR_LE) != (MSR_KERNEL & MSR_LE); > } > > -static inline u32 kvmppc_get_last_inst_internal(struct kvm_vcpu *vcpu, ulong pc) > +static inline int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, Please make prev an enum :) > + u32 *inst) > { > - /* Load the instruction manually if it failed to do so in the > - * exit path */ > - if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) > - kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false); > + ulong pc = kvmppc_get_pc(vcpu); > > - return kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) : > - vcpu->arch.last_inst; > -} > - > -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu) > -{ > - return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu)); > -} > - > -/* > - * Like kvmppc_get_last_inst(), but for fetching a sc instruction. > - * Because the sc instruction sets SRR0 to point to the following > - * instruction, we have to fetch from pc - 4. > - */ > -static inline u32 kvmppc_get_last_sc(struct kvm_vcpu *vcpu) > -{ > - return kvmppc_get_last_inst_internal(vcpu, kvmppc_get_pc(vcpu) - 4); > + if (prev) > + pc -= 4; > + return kvmppc_ld(vcpu, &pc, sizeof(u32), &vcpu->arch.last_inst, false); > } In this case we're already in the slow path. Can we move this into a .c file instead? That would unify it with booke and ... > > static inline ulong kvmppc_get_fault_dar(struct kvm_vcpu *vcpu) > diff --git a/arch/powerpc/include/asm/kvm_booke.h b/arch/powerpc/include/asm/kvm_booke.h > index c7aed61..1e28371 100644 > --- a/arch/powerpc/include/asm/kvm_booke.h > +++ b/arch/powerpc/include/asm/kvm_booke.h > @@ -33,6 +33,8 @@ > #define KVMPPC_INST_EHPRIV_DEBUG (KVMPPC_INST_EHPRIV | \ > (EHPRIV_OC_DEBUG << EHPRIV_OC_SHIFT)) > > +extern int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *inst); ... allow us to move this to a common header file :). > + > static inline void kvmppc_set_gpr(struct kvm_vcpu *vcpu, int num, ulong val) > { > vcpu->arch.gpr[num] = val; > @@ -69,11 +71,6 @@ static inline bool kvmppc_need_byteswap(struct kvm_vcpu *vcpu) > return false; > } > > -static inline u32 kvmppc_get_last_inst(struct kvm_vcpu *vcpu) > -{ > - return vcpu->arch.last_inst; > -} > - > static inline void kvmppc_set_ctr(struct kvm_vcpu *vcpu, ulong val) > { > vcpu->arch.ctr = val; > diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h > index 4a7cc45..619be2f 100644 > --- a/arch/powerpc/include/asm/kvm_ppc.h > +++ b/arch/powerpc/include/asm/kvm_ppc.h > @@ -234,6 +234,22 @@ struct kvmppc_ops { > extern struct kvmppc_ops *kvmppc_hv_ops; > extern struct kvmppc_ops *kvmppc_pr_ops; > > +static inline int kvmppc_get_last_inst(struct kvm_vcpu *vcpu, bool prev, > + u32 *inst) > +{ > + int ret = EMULATE_DONE; > + > + /* Load the instruction manually if it failed to do so in the > + * exit path */ > + if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) > + ret = kvmppc_load_last_inst(vcpu, prev, &vcpu->arch.last_inst); > + > + *inst = kvmppc_need_byteswap(vcpu) ? swab32(vcpu->arch.last_inst) : > + vcpu->arch.last_inst; > + > + return ret; > +} > + > static inline bool is_kvmppc_hv_enabled(struct kvm *kvm) > { > return kvm->arch.kvm_ops == kvmppc_hv_ops; > diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c > index 8056107..2ffb3dd 100644 > --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c > +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c > @@ -530,21 +530,13 @@ static int instruction_is_store(unsigned int instr) > static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu, > unsigned long gpa, gva_t ea, int is_store) > { > - int ret; > u32 last_inst; > - unsigned long srr0 = kvmppc_get_pc(vcpu); > > - /* We try to load the last instruction. We don't let > - * emulate_instruction do it as it doesn't check what > - * kvmppc_ld returns. > + /* > * If we fail, we just return to the guest and try executing it again. > */ > - if (vcpu->arch.last_inst == KVM_INST_FETCH_FAILED) { > - ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false); > - if (ret != EMULATE_DONE || last_inst == KVM_INST_FETCH_FAILED) > - return RESUME_GUEST; > - vcpu->arch.last_inst = last_inst; > - } > + if (kvmppc_get_last_inst(vcpu, false, &last_inst) != EMULATE_DONE) > + return RESUME_GUEST; > > /* > * WARNING: We do not know for sure whether the instruction we just > @@ -558,7 +550,7 @@ static int kvmppc_hv_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu, > * we just return and retry the instruction. > */ > > - if (instruction_is_store(kvmppc_get_last_inst(vcpu)) != !!is_store) > + if (instruction_is_store(last_inst) != !!is_store) > return RESUME_GUEST; > > /* > diff --git a/arch/powerpc/kvm/book3s_paired_singles.c b/arch/powerpc/kvm/book3s_paired_singles.c > index 6c8011f..6362ad4 100644 > --- a/arch/powerpc/kvm/book3s_paired_singles.c > +++ b/arch/powerpc/kvm/book3s_paired_singles.c > @@ -639,26 +639,36 @@ static int kvmppc_ps_one_in(struct kvm_vcpu *vcpu, bool rc, > > int kvmppc_emulate_paired_single(struct kvm_run *run, struct kvm_vcpu *vcpu) > { > - u32 inst = kvmppc_get_last_inst(vcpu); > + u32 inst; > enum emulation_result emulated = EMULATE_DONE; > + int ax_rd, ax_ra, ax_rb, ax_rc; > + short full_d; > + u64 *fpr_d, *fpr_a, *fpr_b, *fpr_c; > > - int ax_rd = inst_get_field(inst, 6, 10); > - int ax_ra = inst_get_field(inst, 11, 15); > - int ax_rb = inst_get_field(inst, 16, 20); > - int ax_rc = inst_get_field(inst, 21, 25); > - short full_d = inst_get_field(inst, 16, 31); > - > - u64 *fpr_d = &VCPU_FPR(vcpu, ax_rd); > - u64 *fpr_a = &VCPU_FPR(vcpu, ax_ra); > - u64 *fpr_b = &VCPU_FPR(vcpu, ax_rb); > - u64 *fpr_c = &VCPU_FPR(vcpu, ax_rc); > - > - bool rcomp = (inst & 1) ? true : false; > - u32 cr = kvmppc_get_cr(vcpu); > + bool rcomp; > + u32 cr; > #ifdef DEBUG > int i; > #endif > > + emulated = kvmppc_get_last_inst(vcpu, false, &inst); > + if (emulated != EMULATE_DONE) > + return emulated; > + > + ax_rd = inst_get_field(inst, 6, 10); > + ax_ra = inst_get_field(inst, 11, 15); > + ax_rb = inst_get_field(inst, 16, 20); > + ax_rc = inst_get_field(inst, 21, 25); > + full_d = inst_get_field(inst, 16, 31); > + > + fpr_d = &VCPU_FPR(vcpu, ax_rd); > + fpr_a = &VCPU_FPR(vcpu, ax_ra); > + fpr_b = &VCPU_FPR(vcpu, ax_rb); > + fpr_c = &VCPU_FPR(vcpu, ax_rc); > + > + rcomp = (inst & 1) ? true : false; > + cr = kvmppc_get_cr(vcpu); > + > if (!kvmppc_inst_is_paired_single(vcpu, inst)) > return EMULATE_FAIL; > > diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c > index 23367a7..48b633c 100644 > --- a/arch/powerpc/kvm/book3s_pr.c > +++ b/arch/powerpc/kvm/book3s_pr.c > @@ -637,42 +637,6 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac) > #endif > } > > -static int kvmppc_read_inst(struct kvm_vcpu *vcpu) > -{ > - ulong srr0 = kvmppc_get_pc(vcpu); > - u32 last_inst = kvmppc_get_last_inst(vcpu); > - int ret; > - > - ret = kvmppc_ld(vcpu, &srr0, sizeof(u32), &last_inst, false); > - if (ret == -ENOENT) { > - ulong msr = kvmppc_get_msr(vcpu); > - > - msr = kvmppc_set_field(msr, 33, 33, 1); > - msr = kvmppc_set_field(msr, 34, 36, 0); > - msr = kvmppc_set_field(msr, 42, 47, 0); > - kvmppc_set_msr_fast(vcpu, msr); > - kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_INST_STORAGE); > - return EMULATE_AGAIN; > - } > - > - return EMULATE_DONE; > -} > - > -static int kvmppc_check_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr) > -{ > - > - /* Need to do paired single emulation? */ > - if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) > - return EMULATE_DONE; > - > - /* Read out the instruction */ > - if (kvmppc_read_inst(vcpu) == EMULATE_DONE) > - /* Need to emulate */ > - return EMULATE_FAIL; > - > - return EMULATE_AGAIN; > -} > - > /* Handle external providers (FPU, Altivec, VSX) */ > static int kvmppc_handle_ext(struct kvm_vcpu *vcpu, unsigned int exit_nr, > ulong msr) > @@ -977,15 +941,24 @@ int kvmppc_handle_exit_pr(struct kvm_run *run, struct kvm_vcpu *vcpu, > { > enum emulation_result er; > ulong flags; > + u32 last_inst; > + int emul; > > program_interrupt: > flags = vcpu->arch.shadow_srr1 & 0x1f0000ull; > > + emul = kvmppc_get_last_inst(vcpu, false, &last_inst); > + if (emul != EMULATE_DONE) { > + r = RESUME_GUEST; > + break; > + } > + > if (kvmppc_get_msr(vcpu) & MSR_PR) { > #ifdef EXIT_DEBUG > - printk(KERN_INFO "Userspace triggered 0x700 exception at 0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu)); > + pr_info("Userspace triggered 0x700 exception at\n" I don't think we want the \n here :). > + "0x%lx (0x%x)\n", kvmppc_get_pc(vcpu), last_inst); > #endif > - if ((kvmppc_get_last_inst(vcpu) & 0xff0007ff) != > + if ((last_inst & 0xff0007ff) != > (INS_DCBZ & 0xfffffff7)) { > kvmppc_core_queue_program(vcpu, flags); > r = RESUME_GUEST; > @@ -1004,7 +977,7 @@ program_interrupt: > break; > case EMULATE_FAIL: > printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n", > - __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu)); > + __func__, kvmppc_get_pc(vcpu), last_inst); > kvmppc_core_queue_program(vcpu, flags); > r = RESUME_GUEST; > break; > @@ -1021,8 +994,25 @@ program_interrupt: > break; > } > case BOOK3S_INTERRUPT_SYSCALL: > + { > + u32 last_sc; > + int emul; > + > + /* Get last sc for papr */ > + if (vcpu->arch.papr_enabled) { > + /* > + * The sc instuction sets SRR0 to point to the next inst > + */ > + emul = kvmppc_get_last_inst(vcpu, true, &last_sc); > + if (emul != EMULATE_DONE) { > + kvmppc_set_pc(vcpu, kvmppc_get_pc(vcpu) - 4); > + r = RESUME_GUEST; > + break; > + } > + } > + > if (vcpu->arch.papr_enabled && > - (kvmppc_get_last_sc(vcpu) == 0x44000022) && > + (last_sc == 0x44000022) && > !(kvmppc_get_msr(vcpu) & MSR_PR)) { > /* SC 1 papr hypercalls */ > ulong cmd = kvmppc_get_gpr(vcpu, 3); > @@ -1067,36 +1057,53 @@ program_interrupt: > r = RESUME_GUEST; > } > break; > + } > case BOOK3S_INTERRUPT_FP_UNAVAIL: > case BOOK3S_INTERRUPT_ALTIVEC: > case BOOK3S_INTERRUPT_VSX: > { > int ext_msr = 0; > + int emul; > + u32 last_inst; > > - switch (exit_nr) { > - case BOOK3S_INTERRUPT_FP_UNAVAIL: ext_msr = MSR_FP; break; > - case BOOK3S_INTERRUPT_ALTIVEC: ext_msr = MSR_VEC; break; > - case BOOK3S_INTERRUPT_VSX: ext_msr = MSR_VSX; break; > - } > + if (!(vcpu->arch.hflags & BOOK3S_HFLAG_PAIRED_SINGLE)) { > + /* Do paired single emulation */ > + > + switch (exit_nr) { > + case BOOK3S_INTERRUPT_FP_UNAVAIL: > + ext_msr = MSR_FP; > + break; > + > + case BOOK3S_INTERRUPT_ALTIVEC: > + ext_msr = MSR_VEC; > + break; > + > + case BOOK3S_INTERRUPT_VSX: > + ext_msr = MSR_VSX; > + break; > + } > > - switch (kvmppc_check_ext(vcpu, exit_nr)) { > - case EMULATE_DONE: > - /* everything ok - let's enable the ext */ > r = kvmppc_handle_ext(vcpu, exit_nr, ext_msr); > break; > - case EMULATE_FAIL: > + } > + > + emul = kvmppc_get_last_inst(vcpu, false, &last_inst); > + if (emul == EMULATE_DONE) { > /* we need to emulate this instruction */ > goto program_interrupt; > break; > - default: > - /* nothing to worry about - go again */ > - break; > + } else { > + r = RESUME_GUEST; > } > + > break; > } > case BOOK3S_INTERRUPT_ALIGNMENT: > - if (kvmppc_read_inst(vcpu) == EMULATE_DONE) { Phew - this removes the ability to inject an instruction page fault on this path. It's probably ok to do so, but please eliminate kvmppc_read_inst() in a separate patch before this one so we can later bisect it more easily when it fails. Alex > - u32 last_inst = kvmppc_get_last_inst(vcpu); > + { > + u32 last_inst; > + int emul = kvmppc_get_last_inst(vcpu, false, &last_inst); > + > + if (emul == EMULATE_DONE) { > u32 dsisr; > u64 dar; > > @@ -1110,6 +1117,7 @@ program_interrupt: > } > r = RESUME_GUEST; > break; > + } > #ifdef CONFIG_PPC_BOOK3S_64 > case BOOK3S_INTERRUPT_FAC_UNAVAIL: > kvmppc_handle_fac(vcpu, vcpu->arch.shadow_fscr >> 56); > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c > index ab62109..34a42b9 100644 > --- a/arch/powerpc/kvm/booke.c > +++ b/arch/powerpc/kvm/booke.c > @@ -752,6 +752,9 @@ static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu) > * they were actually modified by emulation. */ > return RESUME_GUEST_NV; > > + case EMULATE_AGAIN: > + return RESUME_GUEST; > + > case EMULATE_DO_DCR: > run->exit_reason = KVM_EXIT_DCR; > return RESUME_HOST; > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c > index dd2cc03..f692c12 100644 > --- a/arch/powerpc/kvm/e500_mmu_host.c > +++ b/arch/powerpc/kvm/e500_mmu_host.c > @@ -606,6 +606,11 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, > } > } > > +int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *instr) > +{ > + return EMULATE_FAIL; > +} > + > /************* MMU Notifiers *************/ > > int kvm_unmap_hva(struct kvm *kvm, unsigned long hva) > diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c > index da86d9b..c5c64b6 100644 > --- a/arch/powerpc/kvm/emulate.c > +++ b/arch/powerpc/kvm/emulate.c > @@ -224,19 +224,25 @@ static int kvmppc_emulate_mfspr(struct kvm_vcpu *vcpu, int sprn, int rt) > * from opcode tables in the future. */ > int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu) > { > - u32 inst = kvmppc_get_last_inst(vcpu); > - int ra = get_ra(inst); > - int rs = get_rs(inst); > - int rt = get_rt(inst); > - int sprn = get_sprn(inst); > - enum emulation_result emulated = EMULATE_DONE; > + u32 inst; > + int ra, rs, rt, sprn; > + enum emulation_result emulated; > int advance = 1; > > /* this default type might be overwritten by subcategories */ > kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS); > > + emulated = kvmppc_get_last_inst(vcpu, false, &inst); > + if (emulated != EMULATE_DONE) > + return emulated; > + > pr_debug("Emulating opcode %d / %d\n", get_op(inst), get_xop(inst)); > > + ra = get_ra(inst); > + rs = get_rs(inst); > + rt = get_rt(inst); > + sprn = get_sprn(inst); > + > switch (get_op(inst)) { > case OP_TRAP: > #ifdef CONFIG_PPC_BOOK3S > diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c > index bab20f4..1dba84b 100644 > --- a/arch/powerpc/kvm/powerpc.c > +++ b/arch/powerpc/kvm/powerpc.c > @@ -261,6 +261,9 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) > * actually modified. */ > r = RESUME_GUEST_NV; > break; > + case EMULATE_AGAIN: > + r = RESUME_GUEST; > + break; > case EMULATE_DO_MMIO: > run->exit_reason = KVM_EXIT_MMIO; > /* We must reload nonvolatiles because "update" load/store > @@ -270,11 +273,14 @@ int kvmppc_emulate_mmio(struct kvm_run *run, struct kvm_vcpu *vcpu) > r = RESUME_HOST_NV; > break; > case EMULATE_FAIL: > + { > + u32 last_inst; > + kvmppc_get_last_inst(vcpu, false, &last_inst); > /* XXX Deliver Program interrupt to guest. */ > - printk(KERN_EMERG "%s: emulation failed (%08x)\n", __func__, > - kvmppc_get_last_inst(vcpu)); > + pr_emerg("%s: emulation failed (%08x)\n", __func__, last_inst); > r = RESUME_HOST; > break; > + } > default: > WARN_ON(1); > r = RESUME_GUEST; ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation 2014-06-02 15:50 [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman ` (2 preceding siblings ...) 2014-06-02 15:50 ` [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman @ 2014-06-02 15:50 ` Mihai Caraman 2014-06-12 16:04 ` Alexander Graf 3 siblings, 1 reply; 8+ messages in thread From: Mihai Caraman @ 2014-06-02 15:50 UTC (permalink / raw) To: kvm-ppc; +Cc: Mihai Caraman, linuxppc-dev, kvm On book3e, KVM uses load external pid (lwepx) dedicated instruction to read guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI and LRAT), generated by loading a guest address, needs to be handled by KVM. These exceptions are generated in a substituted guest translation context (EPLC[EGS] = 1) from host context (MSR[GS] = 0). Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. lwepx exceptions originate from host state (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious too intrusive for the host. Read guest last instruction from kvmppc_load_last_inst() by searching for the physical address and kmap it. This address the TODO for TLB eviction and execute-but-not-read entries, and allow us to get rid of lwepx until we are able to handle failures. A simple stress benchmark shows a 1% sys performance degradation compared with previous approach (lwepx without failure handling): time for i in `seq 1 10000`; do /bin/echo > /dev/null; done real 0m 8.85s user 0m 4.34s sys 0m 4.48s vs real 0m 8.84s user 0m 4.36s sys 0m 4.44s An alternative solution, to handle lwepx exceptions in KVM, is to temporary highjack the interrupt vector from host. Some cores share host IVOR registers between hardware threads, which is the case of FSL e6500, which impose additional synchronization logic for this solution to work. This optimized solution can be developed later on top of this patch. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> --- v3: - reworked patch description - use unaltered kmap addr for kunmap - get last instruction before beeing preempted v2: - reworked patch description - used pr_* functions - addressed cosmetic feedback arch/powerpc/kvm/booke.c | 32 ++++++++++++ arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++---------- arch/powerpc/kvm/e500_mmu_host.c | 93 +++++++++++++++++++++++++++++++++++ 3 files changed, 134 insertions(+), 28 deletions(-) diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c index 34a42b9..4ef52a8 100644 --- a/arch/powerpc/kvm/booke.c +++ b/arch/powerpc/kvm/booke.c @@ -880,6 +880,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, int r = RESUME_HOST; int s; int idx; + u32 last_inst = KVM_INST_FETCH_FAILED; + enum emulation_result emulated = EMULATE_DONE; /* update before a new last_exit_type is rewritten */ kvmppc_update_timing_stats(vcpu); @@ -887,6 +889,15 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, /* restart interrupts if they were meant for the host */ kvmppc_restart_interrupt(vcpu, exit_nr); + /* + * get last instruction before beeing preempted + * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA + */ + if (exit_nr == BOOKE_INTERRUPT_DATA_STORAGE || + exit_nr == BOOKE_INTERRUPT_DTLB_MISS || + exit_nr == BOOKE_INTERRUPT_HV_PRIV) + emulated = kvmppc_get_last_inst(vcpu, false, &last_inst); + local_irq_enable(); trace_kvm_exit(exit_nr, vcpu); @@ -895,6 +906,26 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, run->exit_reason = KVM_EXIT_UNKNOWN; run->ready_for_interrupt_injection = 1; + switch (emulated) { + case EMULATE_AGAIN: + r = RESUME_GUEST; + goto out; + + case EMULATE_FAIL: + pr_debug("%s: emulation at %lx failed (%08x)\n", + __func__, vcpu->arch.pc, last_inst); + /* For debugging, encode the failing instruction and + * report it to userspace. */ + run->hw.hardware_exit_reason = ~0ULL << 32; + run->hw.hardware_exit_reason |= last_inst; + kvmppc_core_queue_program(vcpu, ESR_PIL); + r = RESUME_HOST; + goto out; + + default: + break; + } + switch (exit_nr) { case BOOKE_INTERRUPT_MACHINE_CHECK: printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR)); @@ -1184,6 +1215,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, BUG(); } +out: /* * To avoid clobbering exit_reason, only check for signals if we * aren't already exiting to userspace for some other reason. diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S index 6ff4480..e000b39 100644 --- a/arch/powerpc/kvm/bookehv_interrupts.S +++ b/arch/powerpc/kvm/bookehv_interrupts.S @@ -121,38 +121,14 @@ 1: .if \flags & NEED_EMU - /* - * This assumes you have external PID support. - * To support a bookehv CPU without external PID, you'll - * need to look up the TLB entry and create a temporary mapping. - * - * FIXME: we don't currently handle if the lwepx faults. PR-mode - * booke doesn't handle it either. Since Linux doesn't use - * broadcast tlbivax anymore, the only way this should happen is - * if the guest maps its memory execute-but-not-read, or if we - * somehow take a TLB miss in the middle of this entry code and - * evict the relevant entry. On e500mc, all kernel lowmem is - * bolted into TLB1 large page mappings, and we don't use - * broadcast invalidates, so we should not take a TLB miss here. - * - * Later we'll need to deal with faults here. Disallowing guest - * mappings that are execute-but-not-read could be an option on - * e500mc, but not on chips with an LRAT if it is used. - */ - - mfspr r3, SPRN_EPLC /* will already have correct ELPID and EGS */ PPC_STL r15, VCPU_GPR(R15)(r4) PPC_STL r16, VCPU_GPR(R16)(r4) PPC_STL r17, VCPU_GPR(R17)(r4) PPC_STL r18, VCPU_GPR(R18)(r4) PPC_STL r19, VCPU_GPR(R19)(r4) - mr r8, r3 PPC_STL r20, VCPU_GPR(R20)(r4) - rlwimi r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS PPC_STL r21, VCPU_GPR(R21)(r4) - rlwimi r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR PPC_STL r22, VCPU_GPR(R22)(r4) - rlwimi r8, r10, EPC_EPID_SHIFT, EPC_EPID PPC_STL r23, VCPU_GPR(R23)(r4) PPC_STL r24, VCPU_GPR(R24)(r4) PPC_STL r25, VCPU_GPR(R25)(r4) @@ -162,10 +138,15 @@ PPC_STL r29, VCPU_GPR(R29)(r4) PPC_STL r30, VCPU_GPR(R30)(r4) PPC_STL r31, VCPU_GPR(R31)(r4) - mtspr SPRN_EPLC, r8 - isync - lwepx r9, 0, r5 - mtspr SPRN_EPLC, r3 + + /* + * We don't use external PID support. lwepx faults would need to be + * handled by KVM and this implies aditional code in DO_KVM (for + * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which + * is too intrusive for the host. Get last instuction in + * kvmppc_get_last_inst(). + */ + li r9, KVM_INST_FETCH_FAILED stw r9, VCPU_LAST_INST(r4) .endif diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index f692c12..0528fe5 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -606,10 +606,103 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, } } +#ifdef CONFIG_KVM_BOOKE_HV int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *instr) { + gva_t geaddr; + hpa_t addr; + hfn_t pfn; + hva_t eaddr; + u32 mas0, mas1, mas2, mas3; + u64 mas7_mas3; + struct page *page; + unsigned int addr_space, psize_shift; + bool pr; + unsigned long flags; + + WARN_ON_ONCE(prev); + + /* Search TLB for guest pc to get the real address */ + geaddr = kvmppc_get_pc(vcpu); + + addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG; + + local_irq_save(flags); + mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space); + mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid); + asm volatile("tlbsx 0, %[geaddr]\n" : : + [geaddr] "r" (geaddr)); + mtspr(SPRN_MAS5, 0); + mtspr(SPRN_MAS8, 0); + mas0 = mfspr(SPRN_MAS0); + mas1 = mfspr(SPRN_MAS1); + mas2 = mfspr(SPRN_MAS2); + mas3 = mfspr(SPRN_MAS3); + mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3; + local_irq_restore(flags); + + /* + * If the TLB entry for guest pc was evicted, return to the guest. + * There are high chances to find a valid TLB entry next time. + */ + if (!(mas1 & MAS1_VALID)) + return EMULATE_AGAIN; + + /* + * Another thread may rewrite the TLB entry in parallel, don't + * execute from the address if the execute permission is not set + */ + pr = vcpu->arch.shared->msr & MSR_PR; + if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) { + pr_debug("Instuction emulation from a guest page\n" + "withot execute permission\n"); + return EMULATE_FAIL; + } + + /* + * We will map the real address through a cacheable page, so we will + * not support cache-inhibited guest pages. Fortunately emulated + * instructions should not live there. + */ + if (mas2 & MAS2_I) { + pr_debug("Instuction emulation from cache-inhibited\n" + "guest pages is not supported\n"); + return EMULATE_FAIL; + } + + /* Get page size */ + psize_shift = MAS1_GET_TSIZE(mas1) + 10; + + /* Map a page and get guest's instruction */ + addr = (mas7_mas3 & (~0ULL << psize_shift)) | + (geaddr & ((1ULL << psize_shift) - 1ULL)); + pfn = addr >> PAGE_SHIFT; + + /* Guard us against emulation from devices area */ + if (unlikely(!page_is_ram(pfn))) { + pr_debug("Instruction emulation from non-RAM host\n" + "pages is not supported\n"); + return EMULATE_FAIL; + } + + if (unlikely(!pfn_valid(pfn))) { + pr_debug("Invalid frame number\n"); + return EMULATE_FAIL; + } + + page = pfn_to_page(pfn); + eaddr = (unsigned long)kmap_atomic(page); + *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK)); + kunmap_atomic((u32 *)eaddr); + + return EMULATE_DONE; +} +#else +int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, u32 *instr) +{ return EMULATE_FAIL; } +#endif /************* MMU Notifiers *************/ -- 1.7.11.7 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation 2014-06-02 15:50 ` [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman @ 2014-06-12 16:04 ` Alexander Graf 2014-06-13 21:56 ` Scott Wood 0 siblings, 1 reply; 8+ messages in thread From: Alexander Graf @ 2014-06-12 16:04 UTC (permalink / raw) To: Mihai Caraman; +Cc: Scott Wood, linuxppc-dev, kvm, kvm-ppc On 06/02/2014 05:50 PM, Mihai Caraman wrote: > On book3e, KVM uses load external pid (lwepx) dedicated instruction to read > guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI > and LRAT), generated by loading a guest address, needs to be handled by KVM. > These exceptions are generated in a substituted guest translation context > (EPLC[EGS] = 1) from host context (MSR[GS] = 0). > > Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), > doing minimal checks on the fast path to avoid host performance degradation. > lwepx exceptions originate from host state (MSR[GS] = 0) which implies > additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking > at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context > Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious > too intrusive for the host. > > Read guest last instruction from kvmppc_load_last_inst() by searching for the > physical address and kmap it. This address the TODO for TLB eviction and > execute-but-not-read entries, and allow us to get rid of lwepx until we are > able to handle failures. > > A simple stress benchmark shows a 1% sys performance degradation compared with > previous approach (lwepx without failure handling): > > time for i in `seq 1 10000`; do /bin/echo > /dev/null; done > > real 0m 8.85s > user 0m 4.34s > sys 0m 4.48s > > vs > > real 0m 8.84s > user 0m 4.36s > sys 0m 4.44s > > An alternative solution, to handle lwepx exceptions in KVM, is to temporary > highjack the interrupt vector from host. Some cores share host IVOR registers > between hardware threads, which is the case of FSL e6500, which impose additional > synchronization logic for this solution to work. This optimized solution can > be developed later on top of this patch. > > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> > --- > v3: > - reworked patch description > - use unaltered kmap addr for kunmap > - get last instruction before beeing preempted > > v2: > - reworked patch description > - used pr_* functions > - addressed cosmetic feedback > > arch/powerpc/kvm/booke.c | 32 ++++++++++++ > arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++---------- > arch/powerpc/kvm/e500_mmu_host.c | 93 +++++++++++++++++++++++++++++++++++ > 3 files changed, 134 insertions(+), 28 deletions(-) > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c > index 34a42b9..4ef52a8 100644 > --- a/arch/powerpc/kvm/booke.c > +++ b/arch/powerpc/kvm/booke.c > @@ -880,6 +880,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > int r = RESUME_HOST; > int s; > int idx; > + u32 last_inst = KVM_INST_FETCH_FAILED; > + enum emulation_result emulated = EMULATE_DONE; > > /* update before a new last_exit_type is rewritten */ > kvmppc_update_timing_stats(vcpu); > @@ -887,6 +889,15 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > /* restart interrupts if they were meant for the host */ > kvmppc_restart_interrupt(vcpu, exit_nr); > > + /* > + * get last instruction before beeing preempted > + * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA > + */ > + if (exit_nr == BOOKE_INTERRUPT_DATA_STORAGE || > + exit_nr == BOOKE_INTERRUPT_DTLB_MISS || > + exit_nr == BOOKE_INTERRUPT_HV_PRIV) Please make this a switch() - that's easier to read. > + emulated = kvmppc_get_last_inst(vcpu, false, &last_inst); > + > local_irq_enable(); > > trace_kvm_exit(exit_nr, vcpu); > @@ -895,6 +906,26 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > run->exit_reason = KVM_EXIT_UNKNOWN; > run->ready_for_interrupt_injection = 1; > > + switch (emulated) { > + case EMULATE_AGAIN: > + r = RESUME_GUEST; > + goto out; > + > + case EMULATE_FAIL: > + pr_debug("%s: emulation at %lx failed (%08x)\n", > + __func__, vcpu->arch.pc, last_inst); > + /* For debugging, encode the failing instruction and > + * report it to userspace. */ > + run->hw.hardware_exit_reason = ~0ULL << 32; > + run->hw.hardware_exit_reason |= last_inst; > + kvmppc_core_queue_program(vcpu, ESR_PIL); > + r = RESUME_HOST; > + goto out; > + > + default: > + break; > + } I think you can just put this into a function. Scott, I think the patch overall looks quite good. Can you please check as well and if you agree give it your reviewed-by? Mike, when Scott gives you a reviewed-by, please include it for the next version. Alex > + > switch (exit_nr) { > case BOOKE_INTERRUPT_MACHINE_CHECK: > printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR)); > @@ -1184,6 +1215,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > BUG(); > } > > +out: > /* > * To avoid clobbering exit_reason, only check for signals if we > * aren't already exiting to userspace for some other reason. > diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S > index 6ff4480..e000b39 100644 > --- a/arch/powerpc/kvm/bookehv_interrupts.S > +++ b/arch/powerpc/kvm/bookehv_interrupts.S > @@ -121,38 +121,14 @@ > 1: > > .if \flags & NEED_EMU > - /* > - * This assumes you have external PID support. > - * To support a bookehv CPU without external PID, you'll > - * need to look up the TLB entry and create a temporary mapping. > - * > - * FIXME: we don't currently handle if the lwepx faults. PR-mode > - * booke doesn't handle it either. Since Linux doesn't use > - * broadcast tlbivax anymore, the only way this should happen is > - * if the guest maps its memory execute-but-not-read, or if we > - * somehow take a TLB miss in the middle of this entry code and > - * evict the relevant entry. On e500mc, all kernel lowmem is > - * bolted into TLB1 large page mappings, and we don't use > - * broadcast invalidates, so we should not take a TLB miss here. > - * > - * Later we'll need to deal with faults here. Disallowing guest > - * mappings that are execute-but-not-read could be an option on > - * e500mc, but not on chips with an LRAT if it is used. > - */ > - > - mfspr r3, SPRN_EPLC /* will already have correct ELPID and EGS */ > PPC_STL r15, VCPU_GPR(R15)(r4) > PPC_STL r16, VCPU_GPR(R16)(r4) > PPC_STL r17, VCPU_GPR(R17)(r4) > PPC_STL r18, VCPU_GPR(R18)(r4) > PPC_STL r19, VCPU_GPR(R19)(r4) > - mr r8, r3 > PPC_STL r20, VCPU_GPR(R20)(r4) > - rlwimi r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS > PPC_STL r21, VCPU_GPR(R21)(r4) > - rlwimi r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR > PPC_STL r22, VCPU_GPR(R22)(r4) > - rlwimi r8, r10, EPC_EPID_SHIFT, EPC_EPID > PPC_STL r23, VCPU_GPR(R23)(r4) > PPC_STL r24, VCPU_GPR(R24)(r4) > PPC_STL r25, VCPU_GPR(R25)(r4) > @@ -162,10 +138,15 @@ > PPC_STL r29, VCPU_GPR(R29)(r4) > PPC_STL r30, VCPU_GPR(R30)(r4) > PPC_STL r31, VCPU_GPR(R31)(r4) > - mtspr SPRN_EPLC, r8 > - isync > - lwepx r9, 0, r5 > - mtspr SPRN_EPLC, r3 > + > + /* > + * We don't use external PID support. lwepx faults would need to be > + * handled by KVM and this implies aditional code in DO_KVM (for > + * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which > + * is too intrusive for the host. Get last instuction in > + * kvmppc_get_last_inst(). > + */ > + li r9, KVM_INST_FETCH_FAILED > stw r9, VCPU_LAST_INST(r4) > .endif > > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c > index f692c12..0528fe5 100644 > --- a/arch/powerpc/kvm/e500_mmu_host.c > +++ b/arch/powerpc/kvm/e500_mmu_host.c > @@ -606,10 +606,103 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, > } > } > > +#ifdef CONFIG_KVM_BOOKE_HV > int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *instr) > { > + gva_t geaddr; > + hpa_t addr; > + hfn_t pfn; > + hva_t eaddr; > + u32 mas0, mas1, mas2, mas3; > + u64 mas7_mas3; > + struct page *page; > + unsigned int addr_space, psize_shift; > + bool pr; > + unsigned long flags; > + > + WARN_ON_ONCE(prev); > + > + /* Search TLB for guest pc to get the real address */ > + geaddr = kvmppc_get_pc(vcpu); > + > + addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG; > + > + local_irq_save(flags); > + mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space); > + mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid); > + asm volatile("tlbsx 0, %[geaddr]\n" : : > + [geaddr] "r" (geaddr)); > + mtspr(SPRN_MAS5, 0); > + mtspr(SPRN_MAS8, 0); > + mas0 = mfspr(SPRN_MAS0); > + mas1 = mfspr(SPRN_MAS1); > + mas2 = mfspr(SPRN_MAS2); > + mas3 = mfspr(SPRN_MAS3); > + mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3; > + local_irq_restore(flags); > + > + /* > + * If the TLB entry for guest pc was evicted, return to the guest. > + * There are high chances to find a valid TLB entry next time. > + */ > + if (!(mas1 & MAS1_VALID)) > + return EMULATE_AGAIN; > + > + /* > + * Another thread may rewrite the TLB entry in parallel, don't > + * execute from the address if the execute permission is not set > + */ > + pr = vcpu->arch.shared->msr & MSR_PR; > + if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) { > + pr_debug("Instuction emulation from a guest page\n" > + "withot execute permission\n"); > + return EMULATE_FAIL; > + } > + > + /* > + * We will map the real address through a cacheable page, so we will > + * not support cache-inhibited guest pages. Fortunately emulated > + * instructions should not live there. > + */ > + if (mas2 & MAS2_I) { > + pr_debug("Instuction emulation from cache-inhibited\n" > + "guest pages is not supported\n"); > + return EMULATE_FAIL; > + } > + > + /* Get page size */ > + psize_shift = MAS1_GET_TSIZE(mas1) + 10; > + > + /* Map a page and get guest's instruction */ > + addr = (mas7_mas3 & (~0ULL << psize_shift)) | > + (geaddr & ((1ULL << psize_shift) - 1ULL)); > + pfn = addr >> PAGE_SHIFT; > + > + /* Guard us against emulation from devices area */ > + if (unlikely(!page_is_ram(pfn))) { > + pr_debug("Instruction emulation from non-RAM host\n" > + "pages is not supported\n"); > + return EMULATE_FAIL; > + } > + > + if (unlikely(!pfn_valid(pfn))) { > + pr_debug("Invalid frame number\n"); > + return EMULATE_FAIL; > + } > + > + page = pfn_to_page(pfn); > + eaddr = (unsigned long)kmap_atomic(page); > + *instr = *(u32 *)(eaddr | (addr & ~PAGE_MASK)); > + kunmap_atomic((u32 *)eaddr); > + > + return EMULATE_DONE; > +} > +#else > +int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, u32 *instr) > +{ > return EMULATE_FAIL; > } > +#endif > > /************* MMU Notifiers *************/ > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation 2014-06-12 16:04 ` Alexander Graf @ 2014-06-13 21:56 ` Scott Wood 0 siblings, 0 replies; 8+ messages in thread From: Scott Wood @ 2014-06-13 21:56 UTC (permalink / raw) To: Alexander Graf; +Cc: Mihai Caraman, linuxppc-dev, kvm, kvm-ppc On Thu, 2014-06-12 at 18:04 +0200, Alexander Graf wrote: > On 06/02/2014 05:50 PM, Mihai Caraman wrote: > > On book3e, KVM uses load external pid (lwepx) dedicated instruction to read > > guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI > > and LRAT), generated by loading a guest address, needs to be handled by KVM. > > These exceptions are generated in a substituted guest translation context > > (EPLC[EGS] = 1) from host context (MSR[GS] = 0). > > > > Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), > > doing minimal checks on the fast path to avoid host performance degradation. > > lwepx exceptions originate from host state (MSR[GS] = 0) which implies > > additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking > > at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context > > Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious > > too intrusive for the host. > > > > Read guest last instruction from kvmppc_load_last_inst() by searching for the > > physical address and kmap it. This address the TODO for TLB eviction and > > execute-but-not-read entries, and allow us to get rid of lwepx until we are > > able to handle failures. > > > > A simple stress benchmark shows a 1% sys performance degradation compared with > > previous approach (lwepx without failure handling): > > > > time for i in `seq 1 10000`; do /bin/echo > /dev/null; done > > > > real 0m 8.85s > > user 0m 4.34s > > sys 0m 4.48s > > > > vs > > > > real 0m 8.84s > > user 0m 4.36s > > sys 0m 4.44s > > > > An alternative solution, to handle lwepx exceptions in KVM, is to temporary > > highjack the interrupt vector from host. Some cores share host IVOR registers > > between hardware threads, which is the case of FSL e6500, which impose additional > > synchronization logic for this solution to work. This optimized solution can > > be developed later on top of this patch. > > > > Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> > > --- > > v3: > > - reworked patch description > > - use unaltered kmap addr for kunmap > > - get last instruction before beeing preempted > > > > v2: > > - reworked patch description > > - used pr_* functions > > - addressed cosmetic feedback > > > > arch/powerpc/kvm/booke.c | 32 ++++++++++++ > > arch/powerpc/kvm/bookehv_interrupts.S | 37 ++++---------- > > arch/powerpc/kvm/e500_mmu_host.c | 93 +++++++++++++++++++++++++++++++++++ > > 3 files changed, 134 insertions(+), 28 deletions(-) > > > > diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c > > index 34a42b9..4ef52a8 100644 > > --- a/arch/powerpc/kvm/booke.c > > +++ b/arch/powerpc/kvm/booke.c > > @@ -880,6 +880,8 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > int r = RESUME_HOST; > > int s; > > int idx; > > + u32 last_inst = KVM_INST_FETCH_FAILED; > > + enum emulation_result emulated = EMULATE_DONE; > > > > /* update before a new last_exit_type is rewritten */ > > kvmppc_update_timing_stats(vcpu); > > @@ -887,6 +889,15 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > /* restart interrupts if they were meant for the host */ > > kvmppc_restart_interrupt(vcpu, exit_nr); > > > > + /* > > + * get last instruction before beeing preempted > > + * TODO: for e6500 check also BOOKE_INTERRUPT_LRAT_ERROR & ESR_DATA > > + */ > > + if (exit_nr == BOOKE_INTERRUPT_DATA_STORAGE || > > + exit_nr == BOOKE_INTERRUPT_DTLB_MISS || > > + exit_nr == BOOKE_INTERRUPT_HV_PRIV) > > Please make this a switch() - that's easier to read. > > > + emulated = kvmppc_get_last_inst(vcpu, false, &last_inst); > > + > > local_irq_enable(); > > > > trace_kvm_exit(exit_nr, vcpu); > > @@ -895,6 +906,26 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > run->exit_reason = KVM_EXIT_UNKNOWN; > > run->ready_for_interrupt_injection = 1; > > > > + switch (emulated) { > > + case EMULATE_AGAIN: > > + r = RESUME_GUEST; > > + goto out; > > + > > + case EMULATE_FAIL: > > + pr_debug("%s: emulation at %lx failed (%08x)\n", > > + __func__, vcpu->arch.pc, last_inst); > > + /* For debugging, encode the failing instruction and > > + * report it to userspace. */ > > + run->hw.hardware_exit_reason = ~0ULL << 32; > > + run->hw.hardware_exit_reason |= last_inst; > > + kvmppc_core_queue_program(vcpu, ESR_PIL); > > + r = RESUME_HOST; > > + goto out; > > + > > + default: > > + break; > > + } > > I think you can just put this into a function. > > Scott, I think the patch overall looks quite good. Can you please check > as well and if you agree give it your reviewed-by? Mike, when Scott > gives you a reviewed-by, please include it for the next version. > > > Alex > > > + > > switch (exit_nr) { > > case BOOKE_INTERRUPT_MACHINE_CHECK: > > printk("MACHINE CHECK: %lx\n", mfspr(SPRN_MCSR)); > > @@ -1184,6 +1215,7 @@ int kvmppc_handle_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > BUG(); > > } > > > > +out: > > /* > > * To avoid clobbering exit_reason, only check for signals if we > > * aren't already exiting to userspace for some other reason. > > diff --git a/arch/powerpc/kvm/bookehv_interrupts.S b/arch/powerpc/kvm/bookehv_interrupts.S > > index 6ff4480..e000b39 100644 > > --- a/arch/powerpc/kvm/bookehv_interrupts.S > > +++ b/arch/powerpc/kvm/bookehv_interrupts.S > > @@ -121,38 +121,14 @@ > > 1: > > > > .if \flags & NEED_EMU > > - /* > > - * This assumes you have external PID support. > > - * To support a bookehv CPU without external PID, you'll > > - * need to look up the TLB entry and create a temporary mapping. > > - * > > - * FIXME: we don't currently handle if the lwepx faults. PR-mode > > - * booke doesn't handle it either. Since Linux doesn't use > > - * broadcast tlbivax anymore, the only way this should happen is > > - * if the guest maps its memory execute-but-not-read, or if we > > - * somehow take a TLB miss in the middle of this entry code and > > - * evict the relevant entry. On e500mc, all kernel lowmem is > > - * bolted into TLB1 large page mappings, and we don't use > > - * broadcast invalidates, so we should not take a TLB miss here. > > - * > > - * Later we'll need to deal with faults here. Disallowing guest > > - * mappings that are execute-but-not-read could be an option on > > - * e500mc, but not on chips with an LRAT if it is used. > > - */ > > - > > - mfspr r3, SPRN_EPLC /* will already have correct ELPID and EGS */ > > PPC_STL r15, VCPU_GPR(R15)(r4) > > PPC_STL r16, VCPU_GPR(R16)(r4) > > PPC_STL r17, VCPU_GPR(R17)(r4) > > PPC_STL r18, VCPU_GPR(R18)(r4) > > PPC_STL r19, VCPU_GPR(R19)(r4) > > - mr r8, r3 > > PPC_STL r20, VCPU_GPR(R20)(r4) > > - rlwimi r8, r6, EPC_EAS_SHIFT - MSR_IR_LG, EPC_EAS > > PPC_STL r21, VCPU_GPR(R21)(r4) > > - rlwimi r8, r6, EPC_EPR_SHIFT - MSR_PR_LG, EPC_EPR > > PPC_STL r22, VCPU_GPR(R22)(r4) > > - rlwimi r8, r10, EPC_EPID_SHIFT, EPC_EPID > > PPC_STL r23, VCPU_GPR(R23)(r4) > > PPC_STL r24, VCPU_GPR(R24)(r4) > > PPC_STL r25, VCPU_GPR(R25)(r4) > > @@ -162,10 +138,15 @@ > > PPC_STL r29, VCPU_GPR(R29)(r4) > > PPC_STL r30, VCPU_GPR(R30)(r4) > > PPC_STL r31, VCPU_GPR(R31)(r4) > > - mtspr SPRN_EPLC, r8 > > - isync > > - lwepx r9, 0, r5 > > - mtspr SPRN_EPLC, r3 > > + > > + /* > > + * We don't use external PID support. lwepx faults would need to be > > + * handled by KVM and this implies aditional code in DO_KVM (for > > + * DTB_MISS, DSI and LRAT) to check ESR[EPID] and EPLC[EGS] which > > + * is too intrusive for the host. Get last instuction in > > + * kvmppc_get_last_inst(). > > + */ > > + li r9, KVM_INST_FETCH_FAILED > > stw r9, VCPU_LAST_INST(r4) > > .endif > > > > diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c > > index f692c12..0528fe5 100644 > > --- a/arch/powerpc/kvm/e500_mmu_host.c > > +++ b/arch/powerpc/kvm/e500_mmu_host.c > > @@ -606,10 +606,103 @@ void kvmppc_mmu_map(struct kvm_vcpu *vcpu, u64 eaddr, gpa_t gpaddr, > > } > > } > > > > +#ifdef CONFIG_KVM_BOOKE_HV > > int kvmppc_load_last_inst(struct kvm_vcpu *vcpu, bool prev, u32 *instr) > > { > > + gva_t geaddr; > > + hpa_t addr; > > + hfn_t pfn; > > + hva_t eaddr; > > + u32 mas0, mas1, mas2, mas3; > > + u64 mas7_mas3; > > + struct page *page; > > + unsigned int addr_space, psize_shift; > > + bool pr; > > + unsigned long flags; > > + > > + WARN_ON_ONCE(prev); What are the semantics of "prev"? > > + /* Search TLB for guest pc to get the real address */ > > + geaddr = kvmppc_get_pc(vcpu); > > + > > + addr_space = (vcpu->arch.shared->msr & MSR_IS) >> MSR_IR_LG; > > + > > + local_irq_save(flags); > > + mtspr(SPRN_MAS6, (vcpu->arch.pid << MAS6_SPID_SHIFT) | addr_space); > > + mtspr(SPRN_MAS5, MAS5_SGS | vcpu->kvm->arch.lpid); > > + asm volatile("tlbsx 0, %[geaddr]\n" : : > > + [geaddr] "r" (geaddr)); > > + mtspr(SPRN_MAS5, 0); > > + mtspr(SPRN_MAS8, 0); > > + mas0 = mfspr(SPRN_MAS0); > > + mas1 = mfspr(SPRN_MAS1); > > + mas2 = mfspr(SPRN_MAS2); > > + mas3 = mfspr(SPRN_MAS3); > > + mas7_mas3 = (((u64) mfspr(SPRN_MAS7)) << 32) | mas3; > > + local_irq_restore(flags); Where is mas0 used? On 64-bit you could read directly from SPRN_MAS7_MAS3. > > + /* > > + * If the TLB entry for guest pc was evicted, return to the guest. > > + * There are high chances to find a valid TLB entry next time. > > + */ > > + if (!(mas1 & MAS1_VALID)) > > + return EMULATE_AGAIN; > > + > > + /* > > + * Another thread may rewrite the TLB entry in parallel, don't > > + * execute from the address if the execute permission is not set > > + */ > > + pr = vcpu->arch.shared->msr & MSR_PR; > > + if ((pr && !(mas3 & MAS3_UX)) || (!pr && !(mas3 & MAS3_SX))) { > > + pr_debug("Instuction emulation from a guest page\n" > > + "withot execute permission\n"); Don't split user-visible strings. s/withot/without/ Give more context in the message, such as __func__ and the address. > > + return EMULATE_FAIL; > > + } > > + > > + /* > > + * We will map the real address through a cacheable page, so we will > > + * not support cache-inhibited guest pages. Fortunately emulated > > + * instructions should not live there. > > + */ > > + if (mas2 & MAS2_I) { > > + pr_debug("Instuction emulation from cache-inhibited\n" > > + "guest pages is not supported\n"); > > + return EMULATE_FAIL; > > + } Mismatches in M or W are bad as well, but shouldn't the page_is_ram() protect us? Except when LRAT is used, but there we can't prevent mismatches anyway. > > + /* Get page size */ > > + psize_shift = MAS1_GET_TSIZE(mas1) + 10; > > + > > + /* Map a page and get guest's instruction */ > > + addr = (mas7_mas3 & (~0ULL << psize_shift)) | > > + (geaddr & ((1ULL << psize_shift) - 1ULL)); > > + pfn = addr >> PAGE_SHIFT; > > + > > + /* Guard us against emulation from devices area */ > > + if (unlikely(!page_is_ram(pfn))) { > > + pr_debug("Instruction emulation from non-RAM host\n" > > + "pages is not supported\n"); > > + return EMULATE_FAIL; > > + } > > + > > + if (unlikely(!pfn_valid(pfn))) { > > + pr_debug("Invalid frame number\n"); > > + return EMULATE_FAIL; > > + } Can we ever have page_is_ram(pfn) && !pfn_valid(pfn)? -Scott ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-06-13 21:57 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-06-02 15:50 [PATCH 0/4 v3] KVM: PPC: Read guest instruction from kvmppc_get_last_inst() Mihai Caraman 2014-06-02 15:50 ` [PATCH 1/4 v3] KVM: PPC: e500mc: Revert "add load inst fixup" Mihai Caraman 2014-06-02 15:50 ` [PATCH 2/4] KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Mihai Caraman 2014-06-02 15:50 ` [PATCH 3/4 v3] KVM: PPC: Alow kvmppc_get_last_inst() to fail Mihai Caraman 2014-06-12 15:15 ` Alexander Graf 2014-06-02 15:50 ` [PATCH 4/4 v3] KVM: PPC: Bookehv: Get vcpu's last instruction for emulation Mihai Caraman 2014-06-12 16:04 ` Alexander Graf 2014-06-13 21:56 ` Scott Wood
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).